skip to Main Content

I apologize in advance, I am certainly not a db expert, so might be asking something obvious here. I have a large postgreql table with the following (example) data :

CREATE TABLE t
    (ID int, componentid varchar(30), scanresult varchar(30), eventdatetime timestamp)
;
    
INSERT INTO t
    (ID, componentid, scanresult, eventdatetime)
VALUES
    (1, 'A', '123', '2018-02-20 00:00:00'),
    (2, 'A', '345','2018-02-20 00:00:01'),
    (3, 'A', '?','2018-02-20 00:00:02'),
    (4, 'A', '54','2018-02-21 00:00:01'),
    (5, 'A', '?','2018-02-21 00:00:02'),
    (6, 'B', '?','2018-02-21 00:00:03'),
    (7, 'B', '457645','2018-02-22 00:00:01'),
    (8, 'B', '?','2018-02-22 00:00:02'),
    (9, 'B', '56465','2018-02-22 00:00:03'),
    (10, 'C', '?','2018-02-23 00:00:01'),
    (11, 'C', '234234','2018-02-21 00:00:03'),
    (12, 'C', '33','2018-02-22 00:00:01'),
    (13, 'C', '55','2018-02-22 00:00:02'),
    (14, 'C', '?','2018-02-22 00:00:03'),
    (15, 'C', '?','2018-02-23 00:00:01'),
    (16, 'C', '?','2018-02-23 00:00:01'),
    (17, 'C', '?','2018-02-23 00:00:02'),
    (18, 'C', '?','2018-02-23 00:00:03');
;

What I trying to achieve is a query that counts the nummer of occurences -per component- that have a scanresult of ?. However, I must ONLY count if the latest and consecutive scanresults are ?

To make clear what I mean, for my example I am expecting:

  • for component A : 1, because the last scanresult ?
  • for component B : 0, last scan is not ?
  • for component C : 5, because the last 5 scanresults are ? (the first ? is followed by a normal scanresult, ergo the records are not consecutive)

Can any of you help me out here?

http://sqlfiddle.com/#!17/643d6b/2

3

Answers


  1. Chosen as BEST ANSWER

    @zegarek . I just found something weird. When is use these number:

    INSERT INTO t (id, componentid, scanresult, eventdatetime)
    VALUES
        (1, 'A', '123', '2018-02-20 00:00:00'),
        (2, 'A', '345','2018-02-20 00:00:01'),
        (3, 'A', '?','2018-02-20 00:00:02'),
        (4, 'A', '?','2018-02-21 00:00:01'),
        (5, 'A', '35','2018-02-21 00:00:02'),
        (6, 'B', '?','2018-02-21 00:00:03'),
        (7, 'B', '457645','2018-02-22 00:00:01'),
        (8, 'B', '?','2018-02-22 00:00:02'),
        (9, 'B', '?','2018-02-22 00:00:03'),
        (10, 'C', '?','2018-02-23 00:00:01'),
        (11, 'C', '234234','2018-02-21 00:00:03'),
        (12, 'C', '33','2018-02-22 00:00:01'),
        (13, 'C', '55','2018-02-22 00:00:02'),
        (14, 'C', '?','2018-02-22 00:00:03'),
        (15, 'C', '343434','2018-02-23 00:00:01'),
        (16, 'C', '?','2018-02-23 00:00:01'),
        (17, 'C', '?','2018-02-23 00:00:02'),
        (18, 'C', '?','2018-02-23 00:00:03')|
    
    The result says:
    A   0
    B   2
    C   2
    

    The C result is off.... Do you have any idea why that is?


    1. Determine continuity by comparing to lag() or lead() window
      functions
    2. Use a stepping sum to only target the latest sequence of
      bad scans
    3. Count the rows in that latest sequence, per component_id.
    select componentid, count(*)filter(where is_latest_bad_scan_chain) 
    from (  select *,(0=sum(continuity_counter)over w2) as is_latest_bad_scan_chain
            from (select *, case when '?'=scanresult
                                 and  '?'=lag(scanresult,1,'?')over w1 
                            then 0 else 1 end as continuity_counter
                  from t
                  window w1 as(partition by componentid order by eventdatetime desc)
                  order by componentid,id )
            window w2 as(partition by componentid order by eventdatetime desc)
            order by componentid,id ) a
    group by componentid;
    
    componentid count
    A 1
    B 0
    C 6

    db<>fiddle

    It’s not possible to directly use a window function in a filter clause, or nest window functions directly, hence the subqueries. If you’re not a fan of indentation, they can be replaced with CTEs.

    Login or Signup to reply.
  2. Number your rows per component backwards, so as to have the last entry numbered 1, the second last 2, etc. Then iterate through a component’s rows as long as the barcode is a ‘?’ (that is a recursive CTE in SQL). Then aggregate and count.

    with recursive
      numbered as
      (
        select
          componentid, dt, barcode,
          row_number() over (partition by componentid order by dt desc, id desc) as rn
        from t
      ),
      results as
      (
        select componentid, dt, barcode, rn from numbered where rn = 1
        union all
        select n.componentid, n.dt, n.barcode, n.rn
        from results  r
        join numbered n on n.componentid = r.componentid
                       and n.rn = r.rn + 1
                       and n.barcode = '?'
        where r.barcode = '?'
      )
    select componentid, count(*) filter (where barcode = '?')
    from results
    group by componentid
    order by componentid;
    

    Demo: http://sqlfiddle.com/#!17/ca1a4c/43

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search