I am trying to convert Safecast API JSON to CSV suitable for spreadsheet programs like Excel so I can prepare some data summaries. I have a semi-manual wget script that downloads several JSON files based on my filters.
I am able to convert a single JSON file to a usable CSV using jq (Ubuntu Linux):
cat bgeigie_imports_001.json | jq -r '["id", "user_id", "source url", "status", "submitted", "measurements_count", "approved", "created_at", "updated_at", "name", "description", "lines_count", "credits", "height", "orientation", "cities", "subtype", "rejected"], (.[] | [.id, .user_id, .source.url, .status, .submitted, .measurements_count, .approved, .created_at, .updated_at, .name, .description, .lines_count, .credits, .height, .orientation, .cities, .subtype, .rejected]) | @csv ' > bgeigie_imports_001.csv
but I am unable to make it run automatically and convert all the JSON files in the folder (this thread did not help).
Any ideas? Loading the input/output file names from a txt file might also help, as I have a list from the wget script…
I appreciate any advice; unfortunately my skills are limited.
Thanks.
PS: merging the JSON files first using jq -s '.' *.json > All_data.json
produces a somewhat different file, and the jq command above then fails to work properly.
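(Presumably the slurped All_data.json wraps each file's top-level array inside another array, so the .[] step then sees arrays rather than objects; if that is the cause, something like jq -s 'add' *.json > All_data.json might flatten everything into a single array first, but I have not verified this.)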
Answers
From your post it's not clear how many JSON files you have, how they correspond to the API link, or what they contain, so I can't really produce a tested, working example; you also mentioned that the final goal is a single CSV. I'd expect something like the following to work for you:
Below is a bash script you can debug line by line.
Validate the result once the single output file is generated (for example, by checking the created_at column). A more complicated script could also remove duplicates and sort by time.
Sample bash script
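This is only a minimal sketch, assuming the downloaded files are named bgeigie_imports_*.json, sit in the current directory, and share the structure shown above; the output name All_data.csv and the glob pattern are assumptions, so adjust them to match your setup.

#!/bin/bash
# Convert every bgeigie_imports_*.json in the current directory into a single CSV.
set -euo pipefail

out="All_data.csv"

# Write the header row once.
echo '"id","user_id","source url","status","submitted","measurements_count","approved","created_at","updated_at","name","description","lines_count","credits","height","orientation","cities","subtype","rejected"' > "$out"

# Append the data rows of each file (same fields as the single-file command, without repeating the header).
for f in bgeigie_imports_*.json; do
  jq -r '.[] | [.id, .user_id, .source.url, .status, .submitted, .measurements_count, .approved, .created_at, .updated_at, .name, .description, .lines_count, .credits, .height, .orientation, .cities, .subtype, .rejected] | @csv' "$f" >> "$out"
done

If you would rather drive it from the filename list your wget script already produces, replace the for loop with while read -r f; do … done < filelist.txt (filelist.txt is just a placeholder name here). Removing duplicates and sorting by created_at is probably easiest on the JSON side, e.g. with jq's unique_by(.id) and sort_by(.created_at), before converting to CSV.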