
I have a model with a jsonb field where responses from a third-party service are stored. The response changes periodically; some fields are added, some disappear. I want to generate a CSV from a sample of records. To do this, I need to gather all possible fields from the sample and turn them into columns in the CSV file.

2 Answers


  1. To achieve this, you’ll need to iterate over your sample records, extract all unique keys from the JSONB field, and then use those keys to create columns in your CSV file. Here’s a step-by-step approach you can follow in Python:

        import csv
    
        # Sample records with JSONB field
        sample_records = [
            {"id": 1, "data": {"key1": "value1", "key2": "value2"}},
            {"id": 2, "data": {"key2": "value2", "key3": "value3"}},
            {"id": 3, "data": {"key1": "value1", "key4": "value4"}}
        ]
    
        # Extract all unique keys from the JSONB field
        all_keys = set()
        for record in sample_records:
            data = record.get("data", {})
            all_keys.update(data.keys())
    
        # Create CSV file and write header
        with open('output.csv', 'w', newline='') as csvfile:
            fieldnames = ['id'] + sorted(all_keys)  # 'id' column plus every unique key, in a stable order
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames, restval='')  # missing keys become empty cells
            writer.writeheader()
    
            # Write data rows
            for record in sample_records:
                row = {'id': record['id']}
                data = record.get("data", {})
                row.update(data)  # Add data fields
                writer.writerow(row)
    
        print("CSV file generated successfully.")
    

    This script iterates over each record in the sample, extracts the keys from the JSONB field, and collects them in a set of all unique keys. It then creates a CSV file and writes a header consisting of the ‘id’ column plus the unique keys, sorted so the column order is stable. Finally, it iterates over the sample records again and writes one row per record, leaving a cell empty wherever a key is absent.

    You can replace sample_records with your actual sample data retrieved from the database. Additionally, ensure that you handle any potential errors or edge cases based on your specific requirements and the nature of your data.
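
    For context, here is a minimal sketch of how the sample could be pulled straight from PostgreSQL instead of the hard-coded list. The connection string and the `responses` table with its `id` and `data` columns are assumptions; adapt them to your schema:

        import psycopg2

        # Hypothetical table: responses(id integer, data jsonb)
        conn = psycopg2.connect("dbname=mydb")  # adjust to your connection details
        with conn, conn.cursor() as cur:
            # ORDER BY random() is an easy (if slow) way to take a sample
            cur.execute("SELECT id, data FROM responses ORDER BY random() LIMIT 100")
            # psycopg2 maps jsonb columns to Python dicts automatically
            sample_records = [{"id": rid, "data": data or {}} for rid, data in cur.fetchall()]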

  2. Pivoting rows to columns is painful because SQL needs to know the schema of a SELECT before it runs. Since a CSV file is just text, though, you can assemble each line as a string; I have done something along these lines before:

        -- sample input: the VALUES list stands in for a real table
        with invals (id, jdata) as (
          values
          (1, '{"a": 11, "b": 12, "c": 12}'::jsonb),
          (2, '{"d": 24, "e": 25, "f": 26}'::jsonb),
          (3, '{"a": 31, "c": 33, "e": 35}'::jsonb),
          (4, '{"a": 41, "b": 42, "c": 43, "d": 44, "f": 46}'::jsonb)
        ), allkeys as (
          -- every distinct key that appears in any row
          select distinct k
            from invals
                 cross join lateral jsonb_each(jdata) as e(k,v)
        )
        -- header row: the -999 sentinel sorts before the real ids
        select -999 as id, string_agg(k, ',' order by k) as d
          from allkeys
        union all
        -- one row per record, with '' where a key is missing
        select id, string_agg(coalesce(i.jdata->>k, ''), ',' order by k)
          from invals i
               cross join allkeys a
         group by id
         order by id
        ;
    

    Working fiddle
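
    If you want the file without leaving Python, here is a sketch along the same lines as the first answer, assuming psycopg2 and the query above saved verbatim as pivot.sql (both are my additions, not part of the original answer). It runs the query and writes its two columns out as CSV lines, relabeling the -999 sentinel row as the header:

        import psycopg2

        conn = psycopg2.connect("dbname=mydb")  # hypothetical connection details
        with conn, conn.cursor() as cur:
            cur.execute(open("pivot.sql").read())  # the pivot query shown above
            with open("output.csv", "w") as f:
                for rid, line in cur.fetchall():
                    # the -999 sentinel marks the header row; label it 'id' instead
                    f.write(("id" if rid == -999 else str(rid)) + "," + line + "\n")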
