
I have a text file with contents:

//input.txt
1) What is the first month of the year?
a) March
b) February
c) January
d) December
Answer: c) January

2) What is the last month of the year?
a) July
b) December
c) August
d) May
Answer: b) December

I would like to write a shell script that loops through this file, input.txt (which has many more entries in the same format), and produces output similar to the JSON below:

[
 {
  "question": "What is the first month of the year?",
  "a": "March",
  "b": "February",
  "c": "January",
  "d": "December",
  "answer": "January",
 },
 {
  "question": "What is the last month of the year?",
  "a": "July",
  "b": "December",
  "c": "August",
  "d": "May",
  "answer": "December",
 },
]

I started by trying to write a bash script that loops through the file, wraps each group of lines separated by an empty line in curly brackets, and puts each item inside the curly brackets in quotation marks, separated by commas, but it isn't working:

#!/bin/bash

output=""

while read line; do
  if [ -z "$line" ]; then
    output+="}\n"
  else
    output+="\"${line}\","
    if [ $(echo "$output" | tail -n 1) == "" ]; then
      output+="{"
    fi
  fi
done < input.txt

output+="}"

echo "$output" > output.txt

3 Answers


  1. Here is one way using jq:

    jq -R -s '
    sub("\n+$"; "") |
    split("\n\n") | map(
      split("\n") | map(split(") ")) | [
        {question: .[0][1]},
        (.[1:-1][] | {(.[0]): .[1]}),
        {answer: .[-1][1]}
      ] | add
    )' input.txt
    

  2. You will pull all your hair out trying to generate proper JSON with Bash.

    First, your example JSON output is not valid JSON. Trailing commas are not allowed in arrays or objects. So your example needs to be:

    [{
            "question": "What is the first month of the year?",
            "a": "March",
            "b": "February",
            "c": "January",
            "d": "December",
            "answer": "January"
        },
        {
            "question": "What is the last month of the year?",
            "a": "July",
            "b": "December",
            "c": "August",
            "d": "May",
            "answer": "December"
        }
    ]
    

    (Note that there is no , after each "answer" value or after the final }. You can check for valid JSON with a tool such as jsonlint.)
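
    If no linter is at hand, a quick check along these lines may help. This is only a minimal sketch, assuming Ruby with its standard json library is installed; valid_json? is a hypothetical helper name, not part of any library.

    ```ruby
    require 'json'

    # Hypothetical helper: true when the text parses as strict JSON, false otherwise.
    def valid_json?(text)
      JSON.parse(text)
      true
    rescue JSON::ParserError
      false
    end

    with_trailing_comma    = '[{"answer": "January",}]'
    without_trailing_comma = '[{"answer": "January"}]'

    puts valid_json?(with_trailing_comma)    # the comma before } is rejected by the parser
    puts valid_json?(without_trailing_comma)
    ```

    The first string mirrors the trailing comma in the question's example, the second mirrors the corrected form.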

    To generate that from your input, there are many JSON generator tools. The easiest for ME is Ruby:

    ruby -00 -r json -ne '
    BEGIN{out=[]}
    sub(/\A\d+\)\s+/,"question)")
    sub(/Answer: [a-z]/,"answer")
    out << $_.split(/\R/).map{|l| l.split(/[):]\s*/,2)}.to_h
    END{puts JSON.pretty_generate(out)}' file
    

    Prints:

    [
      {
        "question": "What is the first month of the year?",
        "a": "March",
        "b": "February",
        "c": "January",
        "d": "December",
        "answer": "January"
      },
      {
        "question": "What is the last month of the year?",
        "a": "July",
        "b": "December",
        "c": "August",
        "d": "May",
        "answer": "December"
      }
    ]
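
    If the one-liner is hard to read, the same steps can be spelled out as a small script. This is only a sketch under the same assumptions about the input format (blocks separated by blank lines, one question line, option lines, one Answer line); quiz_to_records and SAMPLE are hypothetical names used for illustration.

    ```ruby
    require 'json'

    # Sample input in the format from the question (illustrative data only).
    SAMPLE = <<~TEXT
      1) What is the first month of the year?
      a) March
      b) February
      c) January
      d) December
      Answer: c) January

      2) What is the last month of the year?
      a) July
      b) December
      c) August
      d) May
      Answer: b) December
    TEXT

    # Hypothetical helper: turn the quiz text into an array of hashes,
    # one hash per blank-line-separated block.
    def quiz_to_records(text)
      text.strip.split(/\n{2,}/).map do |block|
        # "1) What ..." becomes "question)What ..." so it splits like an option line.
        block = block.sub(/\A\d+\)\s+/, 'question)')
        # "Answer: c) January" becomes "answer) January".
        block = block.sub(/Answer: [a-z]/, 'answer')
        # Split each line at its first ")" or ":" into a [key, value] pair.
        pairs = block.split(/\R/).map { |line| line.split(/[):]\s*/, 2) }
        pairs.to_h
      end
    end

    puts JSON.pretty_generate(quiz_to_records(SAMPLE))
    ```

    For the two sample questions this should produce the same array of two objects as shown above. To process a real file, File.read("input.txt") could be passed in place of SAMPLE.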
    
  3. Two different approaches (with the same result) using the JSON parser xidel:

    $ xidel -s input.txt -e '
      array{
        for $x in tokenize($raw,"\n\n")
        let $a:=tokenize($x,"\n")
        return
        map:merge((
          {"question":substring-after($a[1],") ")},
          $a[position() = 2 to 5] ! {substring-before(.,")"):substring-after(.,") ")},
          {"answer":substring-after($a[6],") ")}
        ))
      }
    '
    
    $ xidel -s input.txt -e '
      array{
        for $x in tokenize($raw,"\n\n") return
        map:merge(
          for $y at $i in tokenize($x,"\n") return {
            if ($i eq 1) then "question"
            else if ($i eq 6) then "answer"
            else substring-before($y,")"):
            substring-after($y,") ")
          }
        )
      }
    '
    