
Can neo4j create the map automatically from the json file if the relationships are defined in the json file?

WahajAhmad
June 22, 2023
218 views
0 votes
2 Answers

I have a JSON file that defines the nodes and their relationships. It looks something like this:

{"p":{"type":"node","id":"0","labels":["Paintings"],"properties":{"date":"1659-01-01T00:00:00","img":"removed-for-brevity(RFB)","name":"King Caspar","sitelink":"1","description":"RFB","exhibit":"RAB","uri":"RFB"}},"r":{"id":"144","type":"relationship","label":"on_MATERIAL","start":{"id":"0","labels":["Paintings"]},"end":{"id":"2504","labels":["Material"]}},"n":{"type":"node","id":"2504","labels":["Material"],"properties":{"name":"oak","sitelink":5,"description":"RFB","uri":"RFB"}}}

"p" is the first node, "r" is the relationship, "n" is the second node.

Is it possible for neo4j to create a graph/map automatically from this json file, without having to define the nodes and relationships through cypher manually?
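
For reference, the p/r/n layout described above can be read in Python roughly as follows. This is only a sketch: the function name `split_record` is hypothetical, and the sample line is a shortened version of the record shown earlier.

```python
import json

def split_record(line):
    """Parse one JSON line and return (start node, relationship, end node)."""
    record = json.loads(line)
    return record["p"], record["r"], record["n"]

# Shortened version of the sample record from the question
sample = (
    '{"p":{"type":"node","id":"0","labels":["Paintings"],'
    '"properties":{"name":"King Caspar"}},'
    '"r":{"id":"144","type":"relationship","label":"on_MATERIAL",'
    '"start":{"id":"0","labels":["Paintings"]},'
    '"end":{"id":"2504","labels":["Material"]}},'
    '"n":{"type":"node","id":"2504","labels":["Material"],'
    '"properties":{"name":"oak"}}}'
)

p, r, n = split_record(sample)

# The relationship's start/end ids point at the two nodes in the same record
print(p["labels"][0], r["label"], n["labels"][0])
print(r["start"]["id"] == p["id"], r["end"]["id"] == n["id"])
```

Each line of the file carries one complete start node, relationship, and end node, so the same node can appear on many lines.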

I am fairly new to Neo4j. I tried following the examples on the Load JSON page, but they define the nodes and their relationships manually, which I want to avoid.

Answers

Chosen as BEST ANSWER

It looks like neo4j can't automatically create a graph data model using a json file (as @cybersam pointed out earlier).

I ended up writing a Python script to do this for me. Posting this here just in case it helps someone. It does the job for me!

from neo4j import GraphDatabase
import json

# Connect to Neo4j
uri = "bolt://localhost:7687"
username = "_username_"
password = "_password_"

driver = GraphDatabase.driver(uri, auth=(username, password))

processed_painting_ids = set()  # maintain a set to track unique painting node IDs
processed_node_ids = set()

# Load JSON data from file
with open("data_json.json", "r") as file:
    for line in file:
        json_data = json.loads(line)

        p_data = json_data["p"]
        r_data = json_data["r"]
        n_data = json_data["n"]

        p_unique_id = p_data.get("id") #keeps track of the id of the "p" node. 

        # Handle missing values in the data
        p_id = str(p_data["id"])
        p_date = str(p_data["properties"].get("date", "Unknown date"))
        p_img = p_data["properties"].get("img", "Unknown img")
        p_name = p_data["properties"].get("name", "Unknown name")
        p_sitelink = str(p_data["properties"].get("sitelink", "Unknown sitelink"))
        p_description = p_data["properties"].get("description", "Unknown description")
        p_exhibit = p_data["properties"].get("exhibit", "Unknown exhibit")
        p_uri = str(p_data["properties"].get("uri", "Unknown uri"))

        r_id = str(r_data["id"])
        r_label = r_data["label"]
        start_id = str(r_data["start"]["id"])
        end_id = str(r_data["end"]["id"])

        n_id = str(n_data["id"])
        n_name = n_data["properties"].get("name", "Unknown name")
        n_sitelink = str(n_data["properties"].get("sitelink","Unknown sitelink"))
        n_description = n_data["properties"].get("description","Unknown description")
        n_uri = n_data["properties"].get("uri","Unknown uri")

        with driver.session() as session:
    
            # Create the "n" material node
            if n_id not in processed_node_ids:
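                # Labels cannot be query parameters, so only the label is concatenated;
                # all property values are passed as parameters to avoid quoting problems
                session.run(
                    "CREATE (n:" + n_data["labels"][0] + " {id: $id, name: $name, sitelink: $sitelink, description: $description, uri: $uri})",
                    {"id": int(n_id), "name": n_name, "sitelink": n_sitelink, "description": n_description, "uri": n_uri},
                )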
                processed_node_ids.add(n_id)
            # Check whether the "p" node has already been created
            if p_unique_id not in processed_painting_ids:
                # Create the "p" node
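                # Property values are passed as parameters; only the label is concatenated
                session.run(
                    "CREATE (p:" + p_data["labels"][0] + " {id: $id, date: $date, img: $img, name: $name, sitelink: $sitelink, description: $description, exhibit: $exhibit, uri: $uri})",
                    {"id": int(p_id), "date": p_date, "img": p_img, "name": p_name, "sitelink": p_sitelink,
                     "description": p_description, "exhibit": p_exhibit, "uri": p_uri},
                )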
                # Add id of the node to the set
                processed_painting_ids.add(p_unique_id)
            # Create the "r" relationship
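            # Relationship types cannot be parameters either; the ids are passed as parameters
            session.run(
                "MATCH (s), (e) WHERE s.id = $start_id AND e.id = $end_id "
                "CREATE (s)-[r:" + r_label + " {id: $r_id}]->(e)",
                {"start_id": int(start_id), "end_id": int(end_id), "r_id": int(r_id)},
            )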

# Release the connection pool once all lines are processed
driver.close()


- cybersam
- June 21, 2023 at 9:33 pm
- 0 votes
No, there is no automated way, and even if there were, the generated result could be suboptimal or even wrong for your use cases.

You need to design the graph data model (node labels, relationship types, etc.) yourself. There are many considerations (like your use cases, and the necessary indexes and constraints) that are not revealed by a simple JSON data dump. Also, you need to understand the schema of the JSON and determine how to map that to your data model.
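
As an illustration of what such a mapping might look like once the model is decided, here is a minimal Python sketch. It assumes the labels and relationship label found in the JSON are the ones wanted in the graph and that `id` identifies a node within its label. The helper names (`safe_name`, `node_statement`, `relationship_statement`) are hypothetical. The functions only build query text and parameter maps, so they can be checked without a running database.

```python
import re

# Labels and relationship types cannot be passed as Cypher parameters,
# so they are validated before being spliced into the query text.
_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def safe_name(name):
    """Return name unchanged if it is a plain identifier, else raise."""
    if not _NAME.match(name):
        raise ValueError(f"unsafe label or relationship type: {name!r}")
    return name

def node_statement(node):
    """Build a (query, params) pair that merges a node on its id."""
    label = safe_name(node["labels"][0])
    query = f"MERGE (x:{label} {{id: $id}}) SET x += $props"
    return query, {"id": node["id"], "props": node.get("properties", {})}

def relationship_statement(rel):
    """Build a (query, params) pair that merges a relationship between two existing nodes."""
    start_label = safe_name(rel["start"]["labels"][0])
    end_label = safe_name(rel["end"]["labels"][0])
    rel_type = safe_name(rel["label"])
    query = (
        f"MATCH (s:{start_label} {{id: $start_id}}), (e:{end_label} {{id: $end_id}}) "
        f"MERGE (s)-[r:{rel_type} {{id: $rel_id}}]->(e)"
    )
    params = {
        "start_id": rel["start"]["id"],
        "end_id": rel["end"]["id"],
        "rel_id": rel["id"],
    }
    return query, params

# Example with a trimmed record in the question's format
node = {"type": "node", "id": "0", "labels": ["Paintings"], "properties": {"name": "King Caspar"}}
rel = {"id": "144", "type": "relationship", "label": "on_MATERIAL",
       "start": {"id": "0", "labels": ["Paintings"]},
       "end": {"id": "2504", "labels": ["Material"]}}

node_query, node_params = node_statement(node)
rel_query, rel_params = relationship_statement(rel)
print(node_query)
print(rel_query)
```

Each pair could then be passed to the driver as `session.run(query, params)`. Using MERGE instead of CREATE means repeated nodes in the file do not produce duplicates. A uniqueness constraint on `id` per label would likely help these lookups, but the exact constraint syntax depends on the Neo4j version.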
