
I have a file that contains this format:

Cus Id: 1234
Cus Name: 10:/John Parks
Cus Type: temporary
Cus client: tesla;toyota
Dept Id: 111

Cus Id: 1235
Cus Name: 10:/William Parks
Cus Type: temporary
Cus client:
Dept Id: 222

How can I convert this to JSON like the following? Any method (bash, jq, or Python) is fine.

[
  {
    "Cus Id": 1234,
    "Cus Name": "10:/John Parks",
    "Cus Type": "temporary",
    "Cus client": "tesla;toyota",
    "Dept Id": 111
  },
  { 
    "Cus Id": 1235,
    "Cus Name": "10:/William Parks",
    "Cus Type": "temporary",
    "Cus client": "null",
    "Dept Id": 222
  }
] 

3 Answers


  1. A jq solution:

    jq -Rs '                        # 1. -Rs = read the input into a single string
        split("\n{2,}"; "")         # 2. split on runs of two or more newlines (blank lines)
        | map(                      # 3. transform each paragraph into an object
            split("\n")
            | map(scan("^([^:]+)(: (.*))?") | {key: first, value: last})
            | from_entries
        )
    ' data.file
    

    outputs

    [
      {
        "Cus Id": "1234",
        "Cus Name": "10:/John Parks",
        "Cus Type": "temporary",
        "Cus client": "tesla;toyota",
        "Dept Id": "111"
      },
      {
        "Cus Id": "1235",
        "Cus Name": "10:/William Parks",
        "Cus Type": "temporary",
        "Cus client": null,
        "Dept Id": "222"
      }
    ]
    
  2. A Python solution:

    import json
    
    def try_to_convert_to_int(val):
        # Return val as an int when it parses as one; otherwise return it unchanged.
        try:
            val = int(val)
        except ValueError:
            pass
        return val
    
    data, group = [], {}
    with open('your_file.txt', 'r') as f_in:
        for line in map(str.strip, f_in):
            if line == "":
                if group:
                    data.append(group)
                group = {}
            else:
                k, v = map(str.strip, line.split(':', maxsplit=1))
                group[k] = try_to_convert_to_int(v) if v else None
    
    if group:
        data.append(group)
    
    print(json.dumps(data, indent=4))
    

    Prints:

    [
        {
            "Cus Id": 1234,
            "Cus Name": "10:/John Parks",
            "Cus Type": "temporary",
            "Cus client": "tesla;toyota",
            "Dept Id": 111
        },
        {
            "Cus Id": 1235,
            "Cus Name": "10:/William Parks",
            "Cus Type": "temporary",
            "Cus client": null,
            "Dept Id": 222
        }
    ]
    
  3. You could read the input as raw text (flag -R), turn off automatic input reading (flag -n), and reduce over the stream of lines produced by inputs, iteratively building up the output array. Start with an array containing one empty object ([{}]), capture each line's contents with a regular expression, and store the result in the last item of the array. If the capture yields no key (as happens on a blank line), append another empty object instead.

    jq -Rn 'reduce (inputs | capture("(?<k>[^:]+):\\s*(?<v>.*)|")) as $in (
      [{}]; if $in.k then last[$in.k] = $in.v else . + [{}] end
    )' data.file
    
    [
      {
        "Cus Id": "1234",
        "Cus Name": "10:/John Parks",
        "Cus Type": "temporary",
        "Cus client": "tesla;toyota",
        "Dept Id": "111"
      },
      {
        "Cus Id": "1235",
        "Cus Name": "10:/William Parks",
        "Cus Type": "temporary",
        "Cus client": "",
        "Dept Id": "222"
      }
    ]
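
    For readers less familiar with jq, the same fold can be sketched in Python. This is only an illustrative analogue of the reduce above, not part of the jq answer: the names LINE_RE, step, parse_records, and SAMPLE are made up for this sketch, and SAMPLE simply repeats the data from the question.

```python
import json
import re
from functools import reduce

# Same idea as the jq capture: "key: value", or an empty match (no key) otherwise.
LINE_RE = re.compile(r"(?P<k>[^:]+):\s*(?P<v>.*)|")

def step(records, line):
    """Fold one input line into the list of records built so far."""
    match = LINE_RE.match(line)
    key = match.group("k")
    if key is None:
        # No key captured (e.g. a blank line): start a new, empty record.
        return records + [{}]
    # Otherwise store the value on the last record.
    records[-1][key] = match.group("v")
    return records

def parse_records(lines):
    # Start with one empty record, like [{}] in the jq version.
    return reduce(step, lines, [{}])

SAMPLE = """\
Cus Id: 1234
Cus Name: 10:/John Parks
Cus Type: temporary
Cus client: tesla;toyota
Dept Id: 111

Cus Id: 1235
Cus Name: 10:/William Parks
Cus Type: temporary
Cus client:
Dept Id: 222
"""

print(json.dumps(parse_records(SAMPLE.splitlines()), indent=2))
```

    As in the jq version, every value stays a string, and an empty "Cus client:" line yields an empty string rather than null.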
    


    Going further:

    • If you want to treat special values differently, adapt $in.v before the assignment. For example, to convert values that look like numbers into numbers, replace empty strings with a special value ("null"), and otherwise keep the given string, you could go with something like ($in.v | tonumber? // (select(. == "") | "null") // .).
    • If you want to treat special blocks differently, process the output after the reduction. For example, to drop the empty objects that are generated when input blocks are separated by more than one empty line, you could go with something like map(select(. != {})).
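
    The two adjustments above can also be sketched in Python for comparison. This is a rough analogue under stated assumptions, not a translation: coerce and drop_empty are hypothetical helper names, the sample records are made up for illustration, and Python's int/float do not accept exactly the same strings as jq's tonumber.

```python
def coerce(value):
    """Rough analogue of: tonumber? // (select(. == "") | "null") // ."""
    # Try the numeric interpretations first, integers before floats.
    for convert in (int, float):
        try:
            return convert(value)
        except ValueError:
            pass
    # Not a number: map the empty string to the special string "null",
    # and keep any other string unchanged.
    return "null" if value == "" else value

def drop_empty(records):
    """Rough analogue of: map(select(. != {}))"""
    return [record for record in records if record != {}]

# Hypothetical parsed records, including the empty object that appears
# when two blocks are separated by more than one blank line.
records = [
    {"Cus Id": "1234", "Cus client": ""},
    {},
    {"Cus Name": "10:/William Parks", "Dept Id": "222"},
]

cleaned = [
    {key: coerce(value) for key, value in record.items()}
    for record in drop_empty(records)
]
print(cleaned)
```

    Note that "null" here is a string, as in the jq expression; use None (or null in jq) if a real JSON null is wanted.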