Question posted in Json
Our archive of expertly curated questions and answers provides insights and solutions to common problems related to this popular data interchange format. From parsing and manipulating JSON data to integrating it with various programming languages and web services, our archive has got you covered. Start exploring today and take your JSON skills to the next level

having trouble converting csv to json using bash jq

L_sidd
December 22, 2022
276 views
4 votes
5 Answers

I have one stream output stored in csv file, I need help converting csv to json:

my csv looks like:

cat output.csv
"k","a1",1,"b1","c1","d1",1
"l","a2",2,"b2","c2","d2",2
"m","a3",3,"b3","c3","d3",3
"n","a4",4,"b4","c4","d4",4
"o","a5",5,"b5","c5","d5",5

Required output:

note: I need key configuration to be added to json.

{
    "configuration": {
      "k": {
        "a": "a1",
        "number1": "1",
        "c": "b1",
        "d": "c1",
        "e": "d1",
        "number2": "1"
      },
      "l": {
        "a": "a2",
        "number1": "2",
        "c": "b2",
        "d": "c2",
        "e": "d2",
        "number2": "2"
      },
      .
      .
      .
    }
}

So far tried with jq:

my function is:

cat api.jq

[
  inputs |
     split(",") |
     map(ltrimstr(""")) |
     map(rtrimstr(""")) |
      {
        a: .[1],
        number1: .[2],
        c: .[3],
        d: .[4],
        e: .[5],
        number2: .[6]
      }
] | {configuration: .}

Output:


jq -nRf api.jq output.csv

{
  "cluster_configuration": [
    {
      "a": "a1",
      "number1": "1",
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": "1"
    },
    {
      "a": "a2",
      "number1": "2",
      "c": "b2",
      "d": "c2",
      "e": "d2",
      "number2": "2"
    },
    {
      "a": "a3",
      "number1": "3",
      "c": "b3",
      "d": "c3",
      "e": "d3",
      "number2": "3"
    },
    {
      "a": "a4",
      "number1": "4",
      "c": "b4",
      "d": "c4",
      "e": "d4",
      "number2": "4"
    },
    {
      "a": "a5",
      "number1": "5",
      "c": "b5",
      "d": "c5",
      "e": "d5",
      "number2": "5"
    }
  ]
}

Answers

jq’s from_entries can be used to generate objects with chosen keys.

In the below example, we first use Miller to convert CSV to JSON more robustly (in a manner that supports values with commas or quotes) before proceeding with jq

Add a header line at the top so Miller knows what key name to associate with each value:
```
cat <(echo k,a,number1,c,d,e,number2) output.csv > output_with_header.csv
```

Convert the csv to json with miller:

mlr --icsv --ojson cat output_with_header.csv > output.json

transform with jq, generating lists of maps with key and value elements and then combining them with from_entries:
```
jq '{configuration: ([.[]|{key: .k,value: (.|del(.k))}]|from_entries)}' output.json
```

This results in:

{
  "configuration": {
    "k": {
      "a": "a1",
      "number1": 1,
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": 1
    },
    "l": {
      "a": "a2",
      "number1": 2,
      "c": "b2",
      "d": "c2",
      "e": "d2",
      "number2": 2
    },
    "m": {
      "a": "a3",
      "number1": 3,
      "c": "b3",
      "d": "c3",
      "e": "d3",
      "number2": 3
    },
    "n": {
      "a": "a4",
      "number1": 4,
      "c": "b4",
      "d": "c4",
      "e": "d4",
      "number2": 4
    },
    "o": {
      "a": "a5",
      "number1": 5,
      "c": "b5",
      "d": "c5",
      "e": "d5",
      "number2": 5
    }
  }
}

All together as a oneliner:

cat <(echo k,a,number1,c,d,e,number2) output.csv | mlr --icsv --ojson cat | jq '{configuration: ([.[]|{key: .k,value: (.|del(.k))}]|from_entries)}'

Insofar as your goal is to make a be a key, using from_entries is suitable for that:

[
    inputs |
    split(",") |
    map(ltrimstr(""")) |
    map(rtrimstr(""")) |
    {
        "key": .[1],
        "value": {
            number: .[2],
            c: .[3],
            d: .[4],
            e: .[5],
            number: .[6]
        }
    }
] |
from_entries |
{ configuration: . }

When run with

jq -R -f api.jq <output.csv

…the output is:

{
  "configuration": {
    "a2": {
      "number": "2",
      "c": "b2",
      "d": "c2",
      "e": "d2"
    },
    "a3": {
      "number": "3",
      "c": "b3",
      "d": "c3",
      "e": "d3"
    },
    "a4": {
      "number": "4",
      "c": "b4",
      "d": "c4",
      "e": "d4"
    },
    "a5": {
      "number": "5",
      "c": "b5",
      "d": "c5",
      "e": "d5"
    }
  }
}

If robustness of CSV parsing is a concern, you could easily adapt
the parser at rosettacode.org. The following converts the CSV rows to JSON arrays; since the "main" program below uses inputs, you’d use the -R and -n command-line options.

## The PEG * operator:
def star(E): (E | star(E)) // . ;

## Helper functions:

# Consume a regular expression rooted at the start of .remainder, or emit empty;
# on success, update .remainder and set .match but do NOT update .result
def consume($re):
  # on failure, match yields empty
  (.remainder | match("^" + $re)) as $match
  | .remainder |= .[$match.length :]
  | .match = $match.string;

def parse($re):
  consume($re)
  | .result = .result + [.match] ;

def ws: consume(" *");

### Parse a string into comma-separated values

def quoted_field_content:
  parse("(("")|([^"]))*")
  | .result[-1] |= gsub(""""; """);

def unquoted_field: parse("[^,"]*");

def quoted_field: consume(""") | quoted_field_content | consume(""");

def field: (ws | quoted_field | ws) // unquoted_field;

def record: field | star(consume(",") | field);

def csv2array:
  {remainder: .} | record | .result;

inputs | csv2array

- StephenQuan
- December 23, 2022 at 12:15 am
- 0 votes
0
I know you raise this question as a bash+jq question, but, if it was a bash+python question, the solution would be trivial:
```
# csv2json.py
import sys, csv, json
data = { "configuration": { } }
for [k,a,n1,c,d,e,n2] in csv.reader(sys.stdin.readlines()):
    data["configuration"][k] = { "a": a, "number1": n1, "c": c, "d": d, "e": e, "number2": n2 }
print(json.dumps(data, indent=2))
```
Then, in bash (I’m assuming Ubuntu here), we could go:
```
python3 csv2json.py < output.csv
```
Login or Signup to reply.

- Fravadona
- December 23, 2022 at 12:31 am
- 0 votes
0
Here’s a possible solution with Miller (available here for several OSs), an interesting tool that supports multiple input/output formats:
```
mlr --icsv -N put -q '
    @map[$1] = {"a": $2, "number1": $3, "c": $4, "d": $5, "e": $6, "number2": $7};
    end { dump { "configuration": @map } }
' file.csv
```
```
{
  "configuration": {
    "k": {
      "a": "a1",
      "number1": 1,
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": 1
    },
    "l": {
...
```
note: for forcing the numbers to be treated as strings you can use the --infer-none option.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.