
I need to parse data from Kafka in ClickHouse using the Kafka engine. As an example, I created the following table (see https://fiddle.clickhouse.com/0e89bec6-4e76-410a-9fc4-cf58ace5f34f):

CREATE TABLE json(name String, data Array(Map(String, String)) ) ENGINE = Memory;

INSERT INTO json FORMAT JSONEachRow {"name": "asd", "data":[{"id":"1", "name": "test1"},{"id":"2", "name": "test2"}]};

This gives me two columns:

name  data 
asd   [{'id':'1', 'name': 'test1'},{'id':'2', 'name': 'test2'}]

How can I transform this to get the following result?

name  id  name
asd    1  test1
asd    2  test2

2 Answers


  1. Does something like this work for what you need? The inner arrayJoin expands data into one row per map, mapApply replaces each map's keys with the value of the name column, and the outer arrayJoin expands the resulting map values into rows:

    SELECT 
       name, 
       arrayJoin(mapValues(mapApply((k,v) -> (name,v), arrayJoin(data)))) AS id 
    FROM json;
    

    The response looks like:

    ┌─name─┬─id─┐
    │ asd  │ 1  │
    │ asd  │ 2  │
    └──────┴────┘
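
    If both id and name are needed as separate columns, an ARRAY JOIN clause may be simpler. This is a minimal sketch against the json table from the question; the aliases item and item_name are my own (item_name avoids a clash with the existing name column), and I have not run it against the linked fiddle:

    ```sql
    -- Expand each element of data into its own row, then read the map keys.
    -- `item` and `item_name` are illustrative aliases, not from the question.
    SELECT
        name,
        item['id']   AS id,
        item['name'] AS item_name
    FROM json
    ARRAY JOIN data AS item;
    ```

    This should yield one row per element of data, with the top-level name repeated on each row.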
    
  2. Instead of storing the JSON as an array of Maps, you could parse that data out into individual rows. This might make your queries much easier to write.

    You could do something like this to each row as it’s being inserted:

    WITH 
        '{"name": "asd", "data":[{"id":"1", "name2":"test1"},{"id":"2","name2":"test2"}]}' AS json
    SELECT 
        JSONExtract(json,'name', 'String') AS name,
        arrayJoin(JSONExtract(json,'data', 'Array(Tuple(String,String))')) AS row,
        row.1 AS id,
        row.2 AS name2;
    

    The result is:

    ┌─name─┬─row───────────┬─id─┬─name2─┐
    │ asd  │ ('1','test1') │ 1  │ test1 │
    │ asd  │ ('2','test2') │ 2  │ test2 │
    └──────┴───────────────┴────┴───────┘
    

    (You can ignore the row column of that response, and insert the other three columns into your table)
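
    Since the data arrives through the Kafka engine, one way to apply this at insert time is a materialized view. The following is only a sketch under assumptions: the table names kafka_raw, items and items_mv, the column raw, and the broker, topic, and group values are all placeholders, and it assumes each Kafka message is one JSON object shaped like the example above:

    ```sql
    -- Hypothetical Kafka source: each message is exposed as a single String column.
    -- Broker, topic, and consumer group values are placeholders.
    CREATE TABLE kafka_raw (raw String)
    ENGINE = Kafka
    SETTINGS kafka_broker_list = 'localhost:9092',
             kafka_topic_list  = 'events',
             kafka_group_name  = 'clickhouse_events',
             kafka_format      = 'JSONAsString';

    -- Hypothetical destination table holding one row per element of "data".
    CREATE TABLE items (name String, id String, name2 String)
    ENGINE = MergeTree
    ORDER BY (name, id);

    -- The view flattens each message as it is consumed and writes into items.
    CREATE MATERIALIZED VIEW items_mv TO items AS
    SELECT
        JSONExtractString(raw, 'name') AS name,
        item.1 AS id,
        item.2 AS name2
    FROM kafka_raw
    ARRAY JOIN JSONExtract(raw, 'data', 'Array(Tuple(String, String))') AS item;
    ```

    Using the ARRAY JOIN clause here keeps the intermediate tuple out of the SELECT list, so only the three destination columns are produced.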
