I’m trying to extract the list of all unique domains, across all categories, from the Disconnect JSON file. Here’s the link to the JSON file I’m using: https://github.com/disconnectme/disconnect-tracking-protection/blob/master/services.json
And here’s my Python code for it:
import json

def get_disconnect_domains(disconnect_json_file):
    disconnect_domains = set()
    with open(disconnect_json_file, 'r') as f:
        disconnect_json = json.load(f)
    for category in disconnect_json['categories']:
        for tracker in category['trackers']:
            disconnect_domains.add(tracker['domain'])
    return disconnect_domains

disconnect_domains = get_disconnect_domains('disconnect.json')
(I have the file saved as disconnect.json) But I keep running into this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-24-1e2723fa469d> in <cell line: 14>()
12
13 # Get the list of all unique domains from the disconnect.json file
---> 14 disconnect_domains = get_disconnect_domains('services.json')
15
16 # Print the list of all unique domains
<ipython-input-24-1e2723fa469d> in get_disconnect_domains(disconnect_json_file)
6 disconnect_json = json.load(f)
7 for category in disconnect_json['categories']:
----> 8 for tracker in category['trackers']:
9 disconnect_domains.add(tracker['domain'])
10
TypeError: string indices must be integers
I don’t usually work in Python, so I’m not sure what exactly is going wrong. Can someone help, please?
2 Answers
The error message "TypeError: string indices must be integers" means you are indexing a string with a string key, as if it were a dictionary, which Python does not allow. The issue most likely lies in the structure of the JSON file you are parsing. Once loaded, JSON objects become dictionaries (indexed by key) and JSON arrays become lists (indexed by integer).
In your case, the traceback shows that category is a string at the point where category['trackers'] runs. That is what happens when disconnect_json['categories'] is a dictionary, because iterating over a dictionary yields its keys rather than its values. To fix this, navigate the JSON structure according to the actual type at each level.
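The failure can be reproduced in miniature. The sketch below uses a made-up dictionary (the keys are illustrative, not taken from the real file) to show that looping over a dictionary gives strings, and that indexing one of those strings with a key raises the same TypeError:

```python
# Hypothetical miniature of the failing pattern; the category names are made up.
data = {"categories": {"Advertising": [], "Analytics": []}}

# Iterating over a dict yields its keys, so each `category` is a str.
seen = [category for category in data["categories"]]

try:
    seen[0]["trackers"]  # same operation as category['trackers'] in the question
except TypeError as exc:
    message = str(exc)

print(seen)
print(message)
```

The exact wording of the message varies slightly between Python versions, but it always begins with "string indices must be integers".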
Here’s an updated version of your code to correctly navigate the JSON structure:
This code assumes that the JSON file contains an array of objects with a ‘trackers’ key, and each tracker object has a ‘domain’ key. It navigates the JSON structure accordingly and extracts the unique domains.
Make sure that the JSON structure of the ‘services.json’ file matches the assumptions made in this code. If the structure of the JSON is different, you’ll need to adjust the code accordingly.
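One way to check those assumptions is to print the container type found at each nesting level before writing any traversal code. The helper below is a hypothetical sketch (describe_shape is not part of any library), and the sample document is a small stand-in rather than the real file:

```python
import json

def describe_shape(node, max_depth=4, depth=0):
    """Return lines summarising the container type at each nesting level,
    following only the first key or first item at every step."""
    lines = []
    indent = "  " * depth
    if isinstance(node, dict):
        keys = list(node)
        lines.append(f"{indent}dict with {len(keys)} keys, e.g. {keys[:3]}")
        if keys and depth < max_depth:
            lines.extend(describe_shape(node[keys[0]], max_depth, depth + 1))
    elif isinstance(node, list):
        lines.append(f"{indent}list with {len(node)} items")
        if node and depth < max_depth:
            lines.extend(describe_shape(node[0], max_depth, depth + 1))
    else:
        lines.append(f"{indent}{type(node).__name__}: {node!r}")
    return lines

# Stand-in document; for the real file, use json.load on the opened file instead.
sample = json.loads(
    '{"categories": {"Advertising": '
    '[{"Example": {"http://example.com/": ["example.com"]}}]}}'
)
shape = describe_shape(sample, max_depth=6)
print("\n".join(shape))
```

If the first level under 'categories' prints as a dict rather than a list, the loop in the question is iterating over key strings, which matches the traceback.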
The domains are fairly deeply nested in your JSON structure. You can access them like this:
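A minimal sketch of that traversal, assuming the layout is: 'categories' maps each category name to a list of entities, each entity is a one-key dictionary whose value maps a site URL to a list of domains. That layout is an assumption about the file, so check it against your copy. The function takes the already-parsed JSON, and the sample data is invented for illustration:

```python
def get_disconnect_domains(services):
    """Collect every unique domain from an already-parsed services document.

    Assumed layout: categories -> list of entities -> {name: {url: [domains]}}.
    """
    domains = set()
    for entities in services["categories"].values():
        for entity in entities:
            for urls in entity.values():
                for value in urls.values():
                    # Defensive: skip anything that is not a list of domains,
                    # in case an entry carries a non-list value such as a flag.
                    if isinstance(value, list):
                        domains.update(value)
    return domains

# Invented sample with the assumed layout, including one non-list value.
sample = {
    "categories": {
        "Advertising": [
            {"ExampleAds": {"http://ads.example/": ["ads.example", "cdn.ads.example"]}}
        ],
        "Analytics": [
            {"ExampleStats": {
                "http://stats.example/": ["stats.example", "ads.example"],
                "flag": "true",
            }}
        ],
    }
}
domains = get_disconnect_domains(sample)
print(sorted(domains))
```

For the real file, load it first with json.load(open('disconnect.json')) (or a with block) and pass the result to the function. Using a set means a domain listed under several categories is kept only once.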