I’m running a case when exists
to try to filter out anamoly data from my table, but I am running into a unsupported correlated query
error.
I have a table of anomalies for different metrics and I want to null those metric values out in my final results if they’re in this anomaly table.
I have a table, anomalytable
, which stores an id, the metric type of the anomaly, and the date range which this occurs.
| id | startdate | enddate | metrictype |
| ---- | --------- | ------- | -----------|
| 1 | 1/1/23 | 1/3/23 | metric1 |
| 2 | 1/3/23 | 1/5/23 | metric2 |
I have a results table where I want to null the metrics for if they belong in the above table.
| id | metric1 | metric2 | date |
| ---- | --------- | ------- | -------|
| 1 | 300 | 1.2 | 1/2/23 |
| 2 | 1.1 | 500 | 1/4/23 |
What I’m hoping my final results will look like (as you can see, the 300 and 500 anomaly numbers were nulled out)
| id | metric1 | metric2 | date |
| ---- | --------- | ------- | -------|
| 1 | null | 1.2 | 1/2/23 |
| 2 | 1.1 | null | 1/4/23 |
My query is below
select
case when exists
(select * from anomalytable b where a.id = b.id and a.date between b.startdate and b.enddate and b.metricname = 'metric1')
then null else a.metric1 end,
case when exists
(select * from anomalytable b where a.id = b.id and a.date between b.startdate and b.enddate and b.metricname = 'metric2')
then null else a.metric2 end
from resultstable a
However everytime I run this I get the correlated subquery pattern not supported
error. I’ve read through the redshift unsupported correlated queries and can’t see what rule I’m breaking. Is it possible to rewrite this with joins in a clean fashion?
2
Answers
Use a left join instead. Simpler and likely faster.
This is a correlated subquery due to the changing results of the subquery based on the row selected in the top query. This is better done with a join – fewer table scans.
Here’s a fiddle to try it out: http://sqlfiddle.com/#!17/e677c/6