Existing Code
import json
filename = 'thunar-volman/debian/control'
dict1 = {}
with open(filename) as fh:
    for line in fh:
        print(line)
        command, description = line.strip().split(': ')
        print(command)
        print(description)
        dict1[command.strip()] = description.strip()
with open("test.json", "w") as out_file:
    json.dump(dict1, out_file, indent=4, sort_keys=False)
Error
Build-Depends
debhelper-compat (= 13),
intltool,
Traceback (most recent call last):
  File "read.py", line 7, in <module>
    command, description = line.strip().split(': ')
ValueError: not enough values to unpack (expected 2, got 1)
The text file I intend to convert to JSON is here: https://salsa.debian.org/xfce-team/goodies/thunar-volman/-/blob/debian/master/debian/control
How can I process the content so that everything after the colon of Build-Depends is treated as the description for the Build-Depends command?
Any help would be very much appreciated as I am very new to json.
6 Answers
Your file looks like YAML.
To work with YAML files you need the ruamel.yaml library.
Install it, then load the file, convert it to JSON, and write the JSON out.
You can do this: if a line starts with whitespace, append it as a new item to the previous command.
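The idea of attaching whitespace-indented lines to the previous command could be sketched roughly as follows. This is not the answerer's original code: the sample text is illustrative (modelled on the output shown in the question, not copied from the real file), and the function name parse_control and the choice to store each value as a list are assumptions made for this example.

```python
import json

# Illustrative sample modelled on the question's output; not the real file.
SAMPLE = """\
Source: thunar-volman
Build-Depends: debhelper-compat (= 13),
               intltool,
Section: xfce
"""


def parse_control(lines):
    """Map each field name to a list of items, one per physical line."""
    result = {}
    current = None  # name of the field that continuation lines belong to
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        if raw[0].isspace():
            # Continuation line: append the item to the previous field.
            if current is not None:
                result[current].append(raw.strip().rstrip(","))
            continue
        if ":" not in raw:
            continue  # neither a field nor a continuation; ignored here
        name, _, value = raw.partition(":")
        current = name.strip()
        value = value.strip().rstrip(",")
        result[current] = [value] if value else []
    return result


parsed = parse_control(SAMPLE.splitlines())
print(json.dumps(parsed, indent=4))
```

Storing a list per field keeps each dependency as a separate JSON array element; joining the items into one string would work just as well if a flat description is preferred.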
Use the maxsplit argument of Python's str.split() method.
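A minimal sketch of what maxsplit does here, assuming a hypothetical helper named split_field and a made-up Homepage line. Note that maxsplit only protects values that themselves contain ': '; a continuation line such as 'intltool,' still yields a single part, so it has to be detected separately.

```python
def split_field(line):
    """Split 'Name: value' at the first ': ' only; return None if absent."""
    parts = line.strip().split(": ", 1)  # maxsplit=1 gives at most two parts
    if len(parts) != 2:
        return None  # e.g. a continuation line such as 'intltool,'
    name, value = parts
    return name.strip(), value.strip()


# The value keeps its own ': ' because only the first separator is used.
print(split_field("Homepage: https://example.org/a: b"))
# A continuation line has no separator, so the helper returns None.
print(split_field("               intltool,"))
```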
Open the file and read it line by line. Ignore blank lines. Split each line on the colon and check the number of tokens to make sure the input data is sane.
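One way to sketch these steps, with the sanity check made explicit, is shown below. The function name read_fields and the decision to raise ValueError on unparseable lines are assumptions for illustration, not the original answer's code.

```python
def read_fields(lines):
    """Return {field: value}, joining continuation lines with a space."""
    fields = {}
    last = None  # most recent field name, target for continuation lines
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        tokens = line.split(":", 1)
        if not line[0].isspace() and len(tokens) == 2 and tokens[0].strip():
            # A new field: exactly two tokens and a non-empty name.
            last = tokens[0].strip()
            fields[last] = tokens[1].strip()
        elif line[0].isspace() and last is not None:
            # An indented line continues the previous field's value.
            fields[last] = (fields[last] + " " + line.strip()).strip()
        else:
            raise ValueError(f"line {number}: cannot parse {line!r}")
    return fields
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `with open(filename) as fh: data = read_fields(fh)`.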
Your problem is very simple; I wrote working code for it in under five minutes, in one go.
You have some lines representing a mapping. The lines can contain colons, and a line that contains a colon indicates the start of a new key-value pair.
The key is on the left side of the colon, and the value can span multiple lines.
We can assign a variable named key and initially set it to None. We then loop through the lines; for each line, if it contains a colon and its first character is not a space, we have found a new key-value pair. We add the previous key-value pair to the dictionary if key is not None. We then remember the current key-value pair and use it in later iterations. If the line is not empty and is not the start of a new pair, it is a continuation of the previous value, so we append it to that value.
In this way we can process all items correctly, but we will miss the last item.
We can add it after the loop.
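The original code for this answer is not reproduced here; the following is a reconstruction of the described algorithm under stated assumptions. The function name parse_pairs and the choice to join continuation lines with a single space are illustrative, not taken from the answer.

```python
def parse_pairs(lines):
    """Collect key-value pairs whose values may span several lines."""
    pairs = {}
    key = None  # no pair has been seen yet
    value = ""
    for line in lines:
        if ":" in line and not line[0].isspace():
            # Start of a new pair: store the previous, now complete, one.
            if key is not None:
                pairs[key] = value
            key, _, rest = line.partition(":")
            key = key.strip()
            value = rest.strip()
        elif line.strip() and key is not None:
            # Non-empty continuation line: extend the remembered value.
            value = f"{value} {line.strip()}".strip()
    if key is not None:
        pairs[key] = value  # the loop itself never stores the last pair
    return pairs
```

The final `if` after the loop is the step that adds the last item, since a pair is only written to the dictionary when the next one begins.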
You can use re, where text contains the string from your question.