skip to Main Content

I have a my_table like this. Col2 is a string type:

| Col1         | Col2                   |
|--------------|------------------------|
| some_string  |{key1:0, key2:1, key3:0}|
| some_string  |{key1:0, key2:0, key3:0}|
| some_string  |{key1:1, key2:2, key3:3}|
| some_string  |{key1:1, key2:1, key3:0}|

I would like to use SQL to parse the key:value pairs on Col2, and filter out pairs where the value is "0" so that the output is:

| Col1         | Col2                   |
|--------------|------------------------|
| some_string  |{key2:1}                |
| some_string  |{}                      |
| some_string  |{key1:1, key2:2, key3:3}|
| some_string  |{key1:1, key2:1}        |

Would someone help?

2

Answers


  1. Here’s an approach you can take using Redshift SQL:

    1. Use string functions to split Col2 into its component key-value pairs.
    2. Filter out the pairs with a value of "0".
    3. Reconstruct the string in the JSON-like format.

    This is a bit complex because Redshift doesn’t support dynamic string parsing into columns or rows as easily as newer SQL databases. Typically, you’d handle such data in the application layer or use a database with JSON processing capabilities.

    SELECT Col1,
           '{' || LISTAGG(Case When part_value != '0' Then part_key || ':' || part_value End, ',') WITHIN GROUP (ORDER BY part_key) || '}' AS Col2
    FROM (
        SELECT Col1,
               SPLIT_PART(SPLIT_PART(Col2, ',', n.n), ':', 1) AS part_key,
               SPLIT_PART(SPLIT_PART(Col2, ',', n.n), ':', 2) AS part_value
        FROM my_table
        CROSS JOIN (SELECT 1 n UNION SELECT 2 UNION SELECT 3) n 
        WHERE Col2 LIKE '%:%' 
    ) sub
    GROUP BY Col1;
    
    Login or Signup to reply.
  2. Since Redshift is a Postgres-based data warehouse This query might be working for you.

    Initially, we plit the Col2 into key-value pairs, then filter out elements that have a value of 0, and finally, we rebuild Col2 :

    WITH cte AS (
       SELECT Col1, Col2,
           TRANSLATE(Col2, '{ }', '') AS FDKJ,
           UNNEST(STRING_TO_ARRAY(TRANSLATE(Col2, '{ }', ''), ',')) AS Col2_splitted
       FROM my_table
    )
    SELECT Col1, Col2, '{' || STRING_AGG(col2_splitted, ', ') || '}' AS Col2_filtered
    FROM cte
    WHERE split_part(col2_splitted, ':', 2) <> '0'
    GROUP BY Col1, Col2
    UNION ALL
    SELECT Col1, Col2, '{}'
    FROM cte
    GROUP BY Col1, Col2
    HAVING COUNT(CASE WHEN split_part(col2_splitted, ':', 2) <> '0' THEN 1 END) = 0;
    

    Results :

    col1        col2                        col2_filtered
    some_string {key1:0, key2:1, key3:0}    {key2:1}
    some_string {key1:1, key2:2, key3:3}    {key1:1, key2:2, key3:3}
    some_string {key1:1, key2:1, key3:0}    {key1:1, key2:1}
    some_string {key1:0, key2:0, key3:0}    {}
    

    Demo on PostgreSql

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search