
I have a JSON file that was scraped from Crunchbase, and I want to print 'contact_fields' from it.

This is a sample of what the contact fields look like in the JSON file:

"contact_fields": {
              "contact_email": "[email protected]",
              "phone_number": "+44 800 011 9688"
            },

The code below is meant to find the contact fields and print the contact email and the phone number:

import json

# Use a name other than "filter" so the built-in is not shadowed
with open('html_found.json') as f:
    data = json.load(f)

# Find and print the specified content if it exists
if "contact_fields" in data:
    contact_fields = data["contact_fields"]
    contact_email = contact_fields.get("contact_email")
    phone_number = contact_fields.get("phone_number")

    if contact_email:
        print("Contact Email:", contact_email)
    else:
        print("Contact Email not found.")

    if phone_number:
        print("Phone Number:", phone_number)
    else:
        print("Phone Number not found.")
else:
    print("No contact fields found in the JSON data.")

The output says "No contact fields found in the JSON data." even though the field is in the file. Did I make a mistake in the file or in the code?

Below are the lines from html_found.json after running it through a JSON beautifier:

[Screenshot of the beautified html_found.json]

2 Answers


  1. The code works in general. I added a small step that writes exactly what you posted into a JSON file, then reads and parses it. Are you sure contact_fields is at the root level, and not nested inside a list or another dict, so that it can be accessed directly via data["contact_fields"]? Please provide a full JSON file that reproduces your problem; nothing is wrong with the part you have posted.

    import json
    
    with open('tmp.json', 'w') as f:
        f.write('{"contact_fields": {"contact_email": "[email protected]","phone_number": "+44 800 011 9688"}}')
    
    with open('tmp.json', 'r') as f:
        data = json.load(f)
    
    # Find and print the specified content if it exists
    if "contact_fields" in data:
        contact_fields = data["contact_fields"]
        contact_email = contact_fields.get("contact_email")
        phone_number = contact_fields.get("phone_number")
    
        if contact_email:
            print("Contact Email:", contact_email)
        else:
            print("Contact Email not found.")
    
        if phone_number:
            print("Phone Number:", phone_number)
        else:
            print("Phone Number not found.")
    else:
        print("No contact fields found in the JSON data.")
    

    My output

    Contact Email: [email protected]
    Phone Number: +44 800 011 9688
    

    EDIT:
    Based on the newly added screenshot of the JSON, it is clear that contact_fields is not at the root level. I am going to assume that what is posted now is the JSON you want to parse.

    So the way to get to contact_fields is something like

    data['HttpState']['GET/v4/data/...']['data']['cards']['contact_fields']
    

    Note that I did not type out the whole key for GET/v4/data/... because it is quite long, so you need to replace it with the full string.

    I also have no idea how static this JSON is; the key under HttpState in particular looks like it might change, so hard-coding the access path might break.
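
One way to avoid hard-coding the long key under HttpState is to loop over whatever entries are present and take the first one with the expected nesting. This is only a minimal sketch, assuming the layout described above (HttpState, then a request key, then data, cards, contact_fields). The helper name find_contact_fields and the sample dictionary are made up for illustration, and the elided key is left elided on purpose.

```python
def find_contact_fields(data):
    """Return the first contact_fields dict under any HttpState entry, else None."""
    http_state = data.get("HttpState") if isinstance(data, dict) else None
    if not isinstance(http_state, dict):
        return None
    for entry in http_state.values():
        node = entry
        # Walk down the assumed path, giving up as soon as a level is not a dict
        for key in ("data", "cards", "contact_fields"):
            node = node.get(key) if isinstance(node, dict) else None
        if isinstance(node, dict):
            return node
    return None


# Hypothetical sample mirroring the assumed structure
sample = {
    "HttpState": {
        "GET/v4/data/...": {
            "data": {"cards": {"contact_fields": {"phone_number": "+44 800 011 9688"}}}
        }
    }
}

fields = find_contact_fields(sample)
print(fields.get("phone_number") if fields else "No contact fields found.")
```

Because the loop never names the request key, a change in that key alone would not break the lookup, although a change in the deeper nesting still would.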

  2. The json module converts a JSON file or string into a hierarchy of lists and dicts, but Python's in operator only looks for a key at the top level of a single dict. More generally, Python offers no built-in way to search recursively through a nested hierarchy, but it is rather simple to write a function for it:

    def search(data, key: str):
        """Recursively search key in an arbitrary json data"""
        def _do_search(data, key):
            """Apply search over an iterable sequence"""
            for d in data:
                newd = search(d, key)
                if newd is not None:
                    return newd
            return None
        # is it a list?
        if isinstance(data, list):
            return _do_search(data, key)
        # or a dict?
        elif isinstance(data, dict):
            if key in data:  # we found the key: just return it
                return data[key]
            return _do_search(data.values(), key) # else recurse
        return None
    

    Your code could become

    import json

    # Use a name other than "filter" so the built-in is not shadowed
    with open('html_found.json') as f:
        data = json.load(f)
    
        # Find and print the specified content if it exists
        contact_fields = search(data, "contact_fields")
        if contact_fields is not None:  # test the search result, not the string literal
            contact_email = contact_fields.get("contact_email")
            phone_number = contact_fields.get("phone_number")
        
            if contact_email:
                print("Contact Email:", contact_email)
            else:
                print("Contact Email not found.")
        
            if phone_number:
                print("Phone Number:", phone_number)
            else:
                print("Phone Number not found.")
        else:
            print("No contact fields found in the JSON data.")
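
The search function above returns only the first match and says nothing about where it was found. If it also helps to know where the key sits in the hierarchy (for example, to write a direct access expression afterwards), a generator variant can yield the path along with each value. This is a sketch rather than a drop-in replacement; find_paths and the sample data are hypothetical names used for illustration.

```python
def find_paths(data, key, path=()):
    """Yield (path, value) for every occurrence of key in nested dicts and lists."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                yield path + (k,), v
            # Keep descending so deeper occurrences are reported too
            yield from find_paths(v, key, path + (k,))
    elif isinstance(data, list):
        for i, item in enumerate(data):
            yield from find_paths(item, key, path + (i,))


# Hypothetical nested sample
sample = {
    "a": [{"contact_fields": {"phone_number": "+44 800 011 9688"}}],
    "b": {"c": 1},
}

for path, value in find_paths(sample, "contact_fields"):
    print(path, value)
# prints: ('a', 0, 'contact_fields') {'phone_number': '+44 800 011 9688'}
```

Each path is a tuple of dict keys and list indexes, so the printed result corresponds to sample['a'][0]['contact_fields'].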
    