Summing values in json objects in Python

sheharbano
February 8, 2023
241 views
1 vote
2 Answers

I have two JSON objects. I want to merge them, but wherever the ids match, the field obj_count should be summed. Is there a way to do this in Python?

Here is an example.
This is the first JSON object:

[
    {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 1855},
    {"text": " watercolour", "id": "x33202 ", "obj_count": 674},
    {"text": "pencil", "id": "AAT16013 ", "obj_count": 297}
]

And here is the second JSON object:

[
    {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 807},
    {"text": " watercolour", "id": "x33202 ", "obj_count": 97},
    {"text": " ink", "id": "AAT15012 ", "obj_count": 297}
]

What I want is something like this, where obj_count is summed for the first two entries:

[
    {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 2662},
    {"text": " watercolour", "id": "x33202 ", "obj_count": 771},
    {"text": " ink", "id": "AAT15012 ", "obj_count": 297},
    {"text": "pencil", "id": "AAT16013 ", "obj_count": 297}
]

Answers

Yes, this is possible.

Any loading or saving can be done with the json module, though the merging code below works on plain Python lists and does not need it.
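As a rough sketch of that loading and saving step, assuming the data arrives as a JSON string (the variable names raw_a and data_a are made up for illustration; for files, json.load and json.dump work the same way on file objects):

```python
import json

# Hypothetical input: a JSON array as text, e.g. read from a file or an API.
raw_a = '[{"text": "pencil", "id": "AAT16013", "obj_count": 297}]'

# json.loads turns the JSON array into a Python list of dicts.
data_a = json.loads(raw_a)
print(data_a[0]["obj_count"])

# json.dumps turns a list of dicts back into JSON text.
as_text = json.dumps(data_a, indent=2)
print(as_text)
```

Once parsed, the lists can be passed to any merging function that expects lists of dicts.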

def sum_list_of_dict(source, add):
    for add_elem in add:
        found = False
        for source_elem in source:
            if add_elem["id"] == source_elem["id"]:
                source_elem["obj_count"] += add_elem["obj_count"]
                found = True
                break  # dupes should not be present
        if not found:
            source.append(add_elem)
    return source


data1 = [
    {"text": "pen and ink and watercolour", "id": "x32505", "obj_count": 1855},
    {"text": "watercolour", "id": "x33202", "obj_count": 674},
    {"text": "pencil", "id": "AAT16013", "obj_count": 297},
]

data2 = [
    {"text": "pen and ink and watercolour", "id": "x32505", "obj_count": 807},
    {"text": "watercolour", "id": "x33202", "obj_count": 97},
    {"text": "ink", "id": "AAT15012", "obj_count": 297},
]

data3 = sum_list_of_dict(data1, data2)

# just for pretty printing
from pprint import pprint
pprint(data3)

Output:

[{'id': 'x32505', 'obj_count': 2662, 'text': 'pen and ink and watercolour'},
 {'id': 'x33202', 'obj_count': 771, 'text': 'watercolour'},
 {'id': 'AAT16013', 'obj_count': 297, 'text': 'pencil'},
 {'id': 'AAT15012', 'obj_count': 297, 'text': 'ink'}]
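One thing to be aware of: sum_list_of_dict updates its first argument in place, so data1 and data3 are the same list after the call. If the inputs need to stay untouched, one possible approach (a sketch only; the wrapper name sum_without_mutation is hypothetical) is to work on deep copies:

```python
import copy


def sum_list_of_dict(source, add):
    # Same logic as in the answer above: modifies and returns `source`.
    for add_elem in add:
        found = False
        for source_elem in source:
            if add_elem["id"] == source_elem["id"]:
                source_elem["obj_count"] += add_elem["obj_count"]
                found = True
                break
        if not found:
            source.append(add_elem)
    return source


def sum_without_mutation(source, add):
    # Hypothetical wrapper: copy both inputs so the originals are left alone.
    return sum_list_of_dict(copy.deepcopy(source), copy.deepcopy(add))


data1 = [{"id": "x33202", "obj_count": 674}]
data2 = [
    {"id": "x33202", "obj_count": 97},
    {"id": "AAT15012", "obj_count": 297},
]

merged = sum_without_mutation(data1, data2)
print(merged)
print(data1)  # still the original single entry with obj_count 674
```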

Use a dict keyed by id to keep track of which ids you have already seen:

If you have seen the id, add the item's obj_count to the stored entry.
If you haven't, just store the item.

values_a = [
    {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 1855},
    {"text": " watercolour", "id": "x33202 ", "obj_count": 674},
    {"text": "pencil", "id": "AAT16013 ", "obj_count": 297}
]

values_b = [
    {"text": " pen and ink and watercolour", "id": "x32505 ", "obj_count": 807},
    {"text": " watercolour", "id": "x33202 ", "obj_count": 97},
    {"text": " ink", "id": "AAT15012 ", "obj_count": 297}
]

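result = {}
for item in [*values_a, *values_b]:
    if item['id'] in result:
        # id already seen: add this count to the stored entry
        result[item['id']]['obj_count'] += item['obj_count']
    else:
        # first time we see this id: store a copy so the input lists are not modified
        result[item['id']] = dict(item)

# back to list of items
result = list(result.values())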
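The same idea extends to more than two lists. A minimal sketch, assuming every item has an id and an obj_count key (the function name merge_counts is made up for this example):

```python
def merge_counts(*lists):
    """Merge any number of lists of dicts, summing obj_count per id."""
    merged = {}
    for items in lists:
        for item in items:
            key = item["id"]
            if key in merged:
                merged[key]["obj_count"] += item["obj_count"]
            else:
                # Shallow copy so the caller's dicts are not modified.
                merged[key] = dict(item)
    # Dicts keep insertion order, so ids appear in first-seen order.
    return list(merged.values())


a = [{"text": "watercolour", "id": "x33202", "obj_count": 674}]
b = [{"text": "watercolour", "id": "x33202", "obj_count": 97}]
c = [{"text": "ink", "id": "AAT15012", "obj_count": 297}]

print(merge_counts(a, b, c))
```

Because each id is looked up in a dict, this does a single pass over all items instead of comparing every item against every other one.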
