skip to Main Content

I’m running a case when exists to try to filter out anamoly data from my table, but I am running into a unsupported correlated query error.

I have a table of anomalies for different metrics and I want to null those metric values out in my final results if they’re in this anomaly table.

I have a table, anomalytable, which stores an id, the metric type of the anomaly, and the date range which this occurs.

| id   | startdate | enddate | metrictype |
| ---- | --------- | ------- | -----------|
| 1    | 1/1/23    | 1/3/23  | metric1    |
| 2    | 1/3/23    | 1/5/23  | metric2    |

I have a results table where I want to null the metrics for if they belong in the above table.

| id   | metric1   | metric2 |  date  |
| ---- | --------- | ------- | -------|
| 1    | 300       | 1.2     | 1/2/23 |
| 2    | 1.1       | 500     | 1/4/23 |

What I’m hoping my final results will look like (as you can see, the 300 and 500 anomaly numbers were nulled out)

| id   | metric1   | metric2 | date   |
| ---- | --------- | ------- | -------|
| 1    | null      | 1.2     | 1/2/23 |
| 2    | 1.1       | null    | 1/4/23 |

My query is below

select 
case when exists 
   (select * from anomalytable b where a.id = b.id and a.date between b.startdate and b.enddate and b.metricname = 'metric1') 
then null else a.metric1 end, 
case when exists 
   (select * from anomalytable b where a.id = b.id and a.date between b.startdate and b.enddate and b.metricname = 'metric2') 
then null else a.metric2 end
from resultstable a

However everytime I run this I get the correlated subquery pattern not supported error. I’ve read through the redshift unsupported correlated queries and can’t see what rule I’m breaking. Is it possible to rewrite this with joins in a clean fashion?

2

Answers


  1. Use a left join instead. Simpler and likely faster.

    select
      r.id,
      case when a.metricname = 'metric1' then null else r.metric1val end,
      case when a.metricname = 'metric2' then null else r.metric2val end,
      r.date
    from resultstable r
    left join anomalytable a on r.id = a.id and r.date between a.startdate and a.enddate
    
    Login or Signup to reply.
  2. This is a correlated subquery due to the changing results of the subquery based on the row selected in the top query. This is better done with a join – fewer table scans.

    select a.id, 
       case when b.metrictype = 'metric1' then null
         else metric1 end metric1,
       case when b.metrictype = 'metric2' then null
         else metric2 end metric2,
       a."date"
    from resultstable a
    join anomalytable b
    on a.id = b.id and a.date between b.startdate and b.enddate
    ;
    

    Here’s a fiddle to try it out: http://sqlfiddle.com/#!17/e677c/6

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search