
I have a CSV file that is not formatted the way AGE expects for loading. My task was to transform it into a new file that AGE can read to create nodes, as specified in the documentation. For that, I wrote a Python script that creates the new file, connects to Postgres, and runs the queries. I thought this could be useful to anyone who has CSV files that are not in the specified format and wants to create nodes and edges from them and send them to AGE.

Here is the old CSV file (ProductsData.csv). It contains the data of products purchased by users (identified by user_id), the store each product was purchased from (identified by store_id), and the product_id, which is the id of the node:

product_name,price,description,store_id,user_id,product_id
iPhone 12,999,"Apple iPhone 12 - 64GB, Space Gray",1234,1001,123
Samsung Galaxy S21,899,"Samsung Galaxy S21 - 128GB, Phantom Black",5678,1002,124
AirPods Pro,249,"Apple AirPods Pro with Active Noise Cancellation",1234,1003,125
Sony PlayStation 5,499,"Sony PlayStation 5 Gaming Console, 1TB",9012,1004,126

Here is the Python file:

import psycopg2
import age
import csv

def read_csv(csv_file):

    with open(csv_file, 'r') as file:
        reader = csv.reader(file)
        rows = list(reader)

    return rows


def create_csv(csv_file):

    new_header = ['id', 'product_name', 'description', 'price', 'store_id', 'user_id']
    property_order = [5, 0, 2, 1, 3, 4]  # Reorder the properties accordingly.
    
    rows = read_csv(csv_file)
    
    new_csv_file = 'products.csv'
    with open(new_csv_file, 'w', newline='') as file:
        writer = csv.writer(file)
        
        writer.writerow(new_header)
        
        # Write each row with reordered properties.
        for row in rows[1:]:
            new_row = [row[i] for i in property_order]
            writer.writerow(new_row)

    print(f"New CSV file '{new_csv_file}' has been created with the desired format.")


def load_csv_nodes(csv_file, graph_name, conn):
    
    with conn.cursor() as cursor:
        try:
            cursor.execute("""LOAD 'age';""")
            cursor.execute("""SET search_path = ag_catalog, "$user", public;""")
            cursor.execute("""SELECT load_labels_from_file(%s, 'Node', %s)""", (graph_name, csv_file,) )
            conn.commit()
        
        except Exception as ex:
            print(type(ex), ex)
            conn.rollback()



def main():

    csv_file = 'ProductsData.csv'
    create_csv(csv_file)

    new_csv_file = 'products.csv'
    GRAPH_NAME = 'csv_test_graph'
    conn = psycopg2.connect(host="localhost", port="5432", dbname="database", user="user", password="password")
    age.setUpAge(conn, GRAPH_NAME)

    path_to_csv = '/path/to/folder/' + new_csv_file
    load_csv_nodes(path_to_csv, GRAPH_NAME, conn)

if __name__ == '__main__':
    main()

The generated file:

id,product_name,description,price,store_id,user_id
123,iPhone 12,"Apple iPhone 12 - 64GB, Space Gray",999,1234,1001
124,Samsung Galaxy S21,"Samsung Galaxy S21 - 128GB, Phantom Black",899,5678,1002
125,AirPods Pro,Apple AirPods Pro with Active Noise Cancellation,249,1234,1003
126,Sony PlayStation 5,"Sony PlayStation 5 Gaming Console, 1TB",499,9012,1004

But then, when running the script, it shows the following message:

<class 'psycopg2.errors.InvalidParameterValue'> label_id must be 1 .. 65535

All the ids in the file are between 1 and 65535, so I don't understand why this error message appears.

Answers


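  1. For how to use load_labels_from_file, refer to the AGE regression test file, which shows how to use all of the load commands.

    You first need to create the Node vertex label before calling load_labels_from_file. The label_id in the error message appears to refer to the label's internal id, not to the id column of your CSV, so the check fails when the label does not exist yet. Create the label with the following command: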

    SELECT create_vlabel('csv_test_graph','Node');
    

    Then run the script as it is.
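
The same step can also be done from the script itself. The following is a minimal sketch, assuming a psycopg2-style cursor and a graph that already exists; the helper name load_nodes_with_label is hypothetical and not part of AGE or of the original script. Note that create_vlabel is expected to raise an error if the label already exists, so this is meant for a first load only.

```python
def load_nodes_with_label(cursor, graph_name, label, csv_path):
    """Create the vertex label, then load nodes from a CSV file.

    Hypothetical helper: `cursor` is assumed to be a psycopg2-style cursor
    on a database where the AGE extension is installed and `graph_name`
    already exists (for example, created by age.setUpAge).
    """
    cursor.execute("LOAD 'age';")
    cursor.execute('SET search_path = ag_catalog, "$user", public;')
    # The label must exist before load_labels_from_file is called.
    cursor.execute("SELECT create_vlabel(%s, %s);", (graph_name, label))
    cursor.execute(
        "SELECT load_labels_from_file(%s, %s, %s);",
        (graph_name, label, csv_path),
    )
```

In the script from the question, this could replace the three cursor.execute calls inside load_csv_nodes, with conn.commit() and the existing rollback handling kept as they are.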

  2. That line is not written properly. You need to replace the placeholder with the correct path to the file:

        path_to_csv = '/path/to/folder/' + new_csv_file
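
One way to avoid a wrong path is to build an absolute path and check that the file exists before passing it to AGE. This is only a sketch: resolve_csv_path is a hypothetical helper, and the check runs on the machine where the script runs. It assumes the Postgres server reads the file from the same filesystem, which may not hold if the server runs on another host or in a container.

```python
from pathlib import Path


def resolve_csv_path(filename, folder="."):
    """Return the absolute path of `filename` inside `folder` as a string.

    Hypothetical helper: raises FileNotFoundError if the file is missing,
    so a bad path is caught before the query is sent.
    """
    path = (Path(folder) / filename).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"CSV file not found: {path}")
    return str(path)
```

For example, path_to_csv = resolve_csv_path('products.csv') would point at the file that create_csv wrote to the current working directory.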
    