I have a spreadsheet like the following:
As you can see, there are multiple "tags" columns: "tags_0", "tags_1", "tags_2".
There can be more of them.
I'm trying to find all the "tags" and put them inside a list using a pandas DataFrame, and eventually put them inside a "tags" array in a JSON file.
I thought of using regex, but I can’t find a way to apply it.
This is the function I’m using to output the json file. I added the tags array for reference:
def convert_products():
    read_exc = pd.read_excel('./data/products.xlsx')
    df = pd.DataFrame(read_exc)
    all_data = []
    for i in range(len(df)):
        js = {
            "sku": df['sku'][i],
            "brand": df['brand'][i],
            "tags": [?]
        }
        all_data.append(js)
    json_object = json.dumps(all_data, ensure_ascii=False, indent=2)
    with open("./data/products.json", "w", encoding='utf-8') as outfile:
        outfile.write(json_object)
How can I achieve this?
Thanks
3 Answers
First, you can get all the column names as a list.
Then you can search this list for every column name that contains tags_. Once you have the tag column names, you can loop through them, retrieve the tag value for each row, and put the values inside a list.
You can then pass that list into the JSON object.
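The steps above could be sketched roughly as follows. The sample DataFrame is made up to stand in for the spreadsheet, and skipping empty cells with pd.notna is an assumption about what should happen when a row has fewer tags than there are tag columns.

```python
import json

import pandas as pd

# Hypothetical sample data standing in for products.xlsx.
df = pd.DataFrame({
    "sku": ["A1", "B2"],
    "brand": ["Acme", "Bolt"],
    "tags_0": ["red", "blue"],
    "tags_1": ["small", None],
    "tags_2": [None, None],
})

# Step 1: get all the column names as a list.
columns = list(df.columns)

# Step 2: keep only the names that contain "tags_".
tag_columns = [name for name in columns if "tags_" in name]

# Step 3: for each row, collect the tag values (assumption: empty cells are skipped).
all_data = []
for i in range(len(df)):
    tags = [df[name][i] for name in tag_columns if pd.notna(df[name][i])]
    all_data.append({
        "sku": df["sku"][i],
        "brand": df["brand"][i],
        "tags": tags,
    })

json_object = json.dumps(all_data, ensure_ascii=False, indent=2)
print(json_object)
```

Because the tag columns are discovered from the column names, the same code keeps working if the spreadsheet later gains tags_3, tags_4, and so on.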
You can achieve that in a much easier way by doing something like this…
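The snippet that went with this answer is not shown here, so the following is only a guess at the kind of approach it describes: build a "tags" column and let pandas serialize the frame with DataFrame.to_json. The sample data, the startswith test, and dropping empty cells are all assumptions.

```python
import json

import pandas as pd

# Hypothetical sample data standing in for products.xlsx.
df = pd.DataFrame({
    "sku": ["A1", "B2"],
    "brand": ["Acme", "Bolt"],
    "tags_0": ["red", "blue"],
    "tags_1": ["small", None],
    "tags_2": [None, None],
})

# Pick out the tag columns by name.
tag_columns = [name for name in df.columns if name.startswith("tags_")]

# Keep the plain columns and add one list-valued "tags" column
# (assumption: empty cells are dropped from each row's list).
out = df[["sku", "brand"]].copy()
out["tags"] = [row.dropna().tolist() for _, row in df[tag_columns].iterrows()]

# orient="records" produces one JSON object per row.
json_text = out.to_json(orient="records", force_ascii=False)
print(json_text)
```

The resulting string can be written to a file as in the question, or a file path can be passed as the first argument of to_json.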
You can try another JSON orientation if you want: "index", "columns", "split", "records", "values", or "table". Check them in the pandas documentation.
You are probably looking for filter:
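A minimal sketch of how DataFrame.filter might be used here, since the original snippet is not shown. The sample data, the regex pattern, and dropping empty cells are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for products.xlsx.
df = pd.DataFrame({
    "sku": ["A1", "B2"],
    "brand": ["Acme", "Bolt"],
    "tags_0": ["red", "blue"],
    "tags_1": ["small", None],
    "tags_2": [None, None],
})

# filter(regex=...) keeps only the columns whose names match the pattern.
tags_df = df.filter(regex=r"^tags_\d+$")

# One list of tags per row (assumption: empty cells are dropped).
tags_per_row = [row.dropna().tolist() for _, row in tags_df.iterrows()]

all_data = [
    {"sku": sku, "brand": brand, "tags": tags}
    for sku, brand, tags in zip(df["sku"], df["brand"], tags_per_row)
]
print(all_data)
```

df.filter(like="tags_") would select the same columns in this example; the regex form is stricter if other column names happen to contain tags_.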