I’ve tried using both Python and Postgres to convert a table of data into JSON that can be used to render a nested dropdown in a web UI.
Here is how the data is structured:
group_desc | category_desc | subcategory_desc | item_desc |
---|---|---|---|
Group C | Category C | Subcategory A | Item H |
Group C | Category C | Subcategory A | Item I |
Group C | Category C | Subcategory A | Item J |
Group C | Category C | Subcategory B | Item K |
Group C | Category C | Subcategory B | Item L |
Group D | Category D | Subcategory C | Item M |
Group D | Category D | Subcategory C | Item N |
Here is how I’d like the JSON to appear:
{
'group': [{
group_desc: 'Group C',
'categories': [{
'category_desc': 'Category C',
'subcategories': [{
'subcategory_desc': 'Subcategory A',
'items': [
'item_desc': 'Item H',
'item_desc': 'Item I',
'item_desc': 'Item J',
]
},
{
'subcategory_desc': 'Subcategory B',
'items': [
'item_desc': 'Item K',
'item_desc': 'Item L',
]
}]
}]
},
{
'group_desc': 'Group D',
'categories': [{
'category_desc': 'Category D',
'subcategories': [{
'subcategory_desc': 'Subcategory D',
'items': [
'item_desc': 'Item M',
'item_desc': 'Item N',
]
}]
}]
}]
}
In essence, for each group of records, translate it to an array of objects (or a list of dictionaries). Thank you!
I’ve tried using DataFrameGroupBy with Python. I’ve also tried json_agg and json_build_object using Postgres. The closest that I was able to get is only a list of dictionaries without the categories and subcategories in a list.
Here is the code in Python using a DataFrame structured like the table above:
import pandas as pd
## PREPARE DATAFRAME
item_dict = {
'group_desc': ['Group C', 'Group C', 'Group C', 'Group C', 'Group C', 'Group D', 'Group D'],
'category_desc': ['Category C', 'Category C', 'Category C', 'Category C', 'Category C', 'Category D', 'Category D'],
'subcategory_desc': ['Subcategory A', 'Subcategory A', 'Subcategory A', 'Subcategory B', 'Subcategory B', 'Subcategory C', 'Subcategory C'],
'item_desc': ['Item H', 'Item I', 'Item J', 'Item K', 'Item L', 'Item M', 'Item N']
}
hierarchy_df = pd.DataFrame(item_dict)
hierarchy_df.head()
## TRY TO CREATE THE JSON
records = []
group_list = []
category_list = []
subcategory_list = []
prev_group = ''
prev_category = ''
prev_subcategory = ''
for ix, row in hierarchy_df.iterrows():
# initialize dict
entry = {}
# pull item group
group_desc = row['group_desc']
category_desc = row['category_desc']
subcategory_desc = row['subcategory_desc']
desc = row['item_desc']
# reset list
if((prev_subcategory != subcategory_desc) | (prev_category != category_desc) | (prev_group != group_desc)):
group_list = []
category_list = []
subcategory_list = []
# check for category
if(category_desc) is not None:
# check for subcategory
if(subcategory_desc) is not None:
subcategory_list.append(desc)
entry['group'] = {
'group_desc': group_desc,
'category': {
'category_desc': category_desc,
'subcategory': {
'subscategory_desc': subcategory_desc,
'item': {
'item_desc': subcategory_list
}
}
}
}
# if no subcategory
else:
category_list.append(desc)
entry['group'] = {
'group_desc': group_desc,
'category': {
'category_desc': category_desc,
'item': {
'item_desc': category_list
}
}
}
# if no category
else:
group_list.append(desc)
entry['group'] = {
'group_desc': group_desc,
'item': {
'desc': group_list
}
}
records.append(entry)
prev_group = row['group_desc']
prev_category = row['category_desc']
prev_subcategory = row['subcategory_desc']
# dedupe (temp workaround)
records = [i for n, i in enumerate(records)
if i not in records[n + 1:]]
# preview result
records
2
Answers
Thanks to @jqurious, I was able to come up with this solution. This helped nest all of the related data into the desired JSON format.
Thanks everyone!
There are probably better ways of doing this but as for your current approach, this may be helpful.
Instead of trying to create the final output in one pass, you can first create a nested dict using the values as the keys, and only create a list at the last level:
You can then add the headers/lists back in to the higher levels:
Output:
Thank you! Here is the final code I used.
Dart’s edit to the comment. Your idea was very helpful! I wound up using a nested loop based on your logic! Here is the code: