
While exploring the Apache AGE Viewer for a data analysis project, a question came to mind: how can I import my data, which is in CSV or JSON format, so that I can start analyzing it? What is the best method?

I searched for sources to get an accurate answer but unfortunately couldn't find a good explanation. Some recommend Gremlin or Cypher to load the data, while others say to convert it into TinkerPop first.

I hope to get a thorough explanation of how to import the data using either method. Looking forward to it.

4

Answers


  1. If you want to import from a CSV file, you can refer to this. Make sure to preprocess your file so that the columns and headings have a consistent format.

    You can also use the PostgreSQL "COPY" statement to load the file into a regular table. It can be done as:

    COPY [YOUR_TABLE_NAME] FROM '[PATH_TO_CSV-FILE]' WITH (FORMAT csv);
    

    The documentation for COPY is here.
    There are other ways as well, such as using TinkerPop and Cypher, each with its own benefits.

  2. Yes, you can import data from CSV files into Postgres using Apache AGE. The function load_labels_from_file is used to load vertices from the CSV files. Sample syntax:

    load_labels_from_file('<graph name>','<label name>','<file path>')
    

    For example, create the label Country and load its vertices from a CSV file:

    SELECT create_graph('agload_test_graph');
    SELECT create_vlabel('agload_test_graph', 'Country');
    SELECT load_labels_from_file('agload_test_graph', 'Country','age_load/countries.csv');
    

    For more details you can follow this: Importing graph from file

    But don’t forget to preprocess your files so that the columns and headings are in the correct format.
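
The preprocessing step mentioned above is not spelled out in the thread, so here is a minimal sketch of one way to do it in Python. It assumes that "correct format" means trimmed header names, trimmed cell values, and the same number of columns in every row; the function name normalize_headers is hypothetical and not part of Apache AGE.

```python
import csv
import io


def normalize_headers(csv_text):
    """Return CSV text with trimmed headers and cells, rejecting ragged rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text

    header = [name.strip() for name in rows[0]]
    width = len(header)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)

    for record_no, row in enumerate(rows[1:], start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != width:
            raise ValueError(
                f"record {record_no} has {len(row)} columns, expected {width}"
            )
        writer.writerow([cell.strip() for cell in row])

    return out.getvalue()


if __name__ == "__main__":
    print(normalize_headers(" id , name \n1, Alice \n\n2,Bob\n"))
```

Running the cleaned text through a check like this before calling load_labels_from_file makes ragged rows fail early with a readable message.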

  3. To load labels (vertices) in AGE, you can use the following function.

    load_labels_from_file('<graph name>', 
                      '<label name>',
                      '<file path>')
    
    The fourth parameter (id_field_exists) is optional and is only used when the CSV file does not contain an ID column.
    
    load_labels_from_file('<graph name>', 
                      '<label name>',
                      '<file path>', 
                      false)
    

    Format of CSV File for labels:

    ID: It shall be the first column of the file, and all values shall be positive integers. This field is optional when id_field_exists is false, but it must be present when id_field_exists is not set to false.

    Properties: All other columns contain the properties of the nodes. The header row shall contain the property names.
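
The rules above can be checked before loading. The following is a rough sketch, assuming only what is stated here: the first column holds positive integer IDs unless id_field_exists is false, and every header cell names a property. The helper check_vertex_csv is hypothetical and not part of Apache AGE.

```python
import csv
import io
import re


def check_vertex_csv(csv_text, id_field_exists=True):
    """Return a list of problems found in a vertex (label) CSV."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        return ["file is empty"]

    problems = []

    # Every header cell should name a property.
    for position, name in enumerate(rows[0], start=1):
        if not name.strip():
            problems.append(f"header column {position} has no property name")

    # When IDs are supplied, the first column must hold positive integers.
    if id_field_exists:
        for record_no, row in enumerate(rows[1:], start=1):
            value = row[0]
            if not re.fullmatch(r"[0-9]+", value) or int(value) == 0:
                problems.append(
                    f"record {record_no}: id {value!r} is not a positive integer"
                )

    return problems


if __name__ == "__main__":
    for problem in check_vertex_csv("id,name\n0,Chile\nx,Fiji\n"):
        print(problem)
```

An empty result list means the file satisfies these two rules; it does not guarantee that AGE will accept it.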

    For adding edges, the following function is used.

    load_edges_from_file('<graph name>',
                    '<label name>',
                    '<file path>');
    

    Format of the CSV File for edges is as follows:

    start_id: ID of the node where the edge starts. This ID shall be present in the nodes.csv file.

    start_vertex_type: class (label) of the start node.

    end_id: ID of the node where the edge ends.

    end_vertex_type: class (label) of the end node.

    properties: properties of the edge. The header shall contain the property names.
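
Since every start_id and end_id has to refer to a node that was already loaded, it can help to cross-check the edge file against the node IDs first. Below is a small sketch of that check. The helper check_edge_csv and the vertex_ids mapping (label name to a set of IDs) are hypothetical names for illustration, and the sketch assumes the four column names listed above appear in the header.

```python
import csv
import io

REQUIRED_COLUMNS = ["start_id", "start_vertex_type", "end_id", "end_vertex_type"]


def check_edge_csv(edge_text, vertex_ids):
    """Return a list of problems found in an edge CSV.

    vertex_ids maps a vertex label to the set of integer IDs loaded for it.
    """
    reader = csv.DictReader(io.StringIO(edge_text))
    header = reader.fieldnames or []

    missing = [column for column in REQUIRED_COLUMNS if column not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    for edge_no, row in enumerate(reader, start=1):
        for side in ("start", "end"):
            label = row[f"{side}_vertex_type"]
            raw_id = row[f"{side}_id"] or ""
            known_ids = vertex_ids.get(label, set())
            is_number = raw_id.isascii() and raw_id.isdigit()
            if not is_number or int(raw_id) not in known_ids:
                problems.append(
                    f"edge {edge_no}: {side}_id {raw_id!r} "
                    f"not found under label {label!r}"
                )
    return problems


if __name__ == "__main__":
    nodes = {"Country": {1, 2}, "City": {10}}
    edges = (
        "start_id,start_vertex_type,end_id,end_vertex_type,since\n"
        "10,City,1,Country,1990\n"
        "10,City,3,Country,2001\n"
    )
    for problem in check_edge_csv(edges, nodes):
        print(problem)
```

Any extra columns, such as since in the sample, are treated as edge properties and are not checked.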

    For a detailed explanation and example, you can visit this link.

    CSV LINK

  4. In Apache AGE, a CSV file containing node data should be formatted as follows:

    id:

    It should be the first column of the file and all values should be a positive integer. This is an optional field when id_field_exists is false. However, it should be present when id_field_exists is not set to false.

    Properties:

    All other columns contain the properties of the nodes. The header row shall contain the property names.

    Create Vertex Label:

    SELECT create_vlabel('GraphName','LabelName');
    

    Load Data from CSV:

    SELECT load_labels_from_file('GraphName',
                                 'LabelName',
                                 'Path/to/file.csv');
    

    Similarly, in Apache AGE a CSV file for edges should be formatted as follows:

    start_id

    The ID of the node where the edge starts. This ID shall be present in the nodes.csv file.

    start_vertex_type

    It should contain the class/label of the start node.

    end_id

    The ID of the node where the edge ends. This ID should also be present in the nodes.csv file.

    end_vertex_type

    It should contain the class/label of the end node.

    properties

    The properties of the edge. The header (first row) shall contain the property names. The second and following rows contain the data (values).

    Create Edge Label:

    SELECT create_elabel('GraphName','EdgeLabelName');
    

    Load Edge Data from csv File:

    SELECT load_edges_from_file('GraphName', 'EdgeLabelName',
         'Path/to/file.csv');
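
Because edges refer to node IDs, the vertex files need to be loaded before the edge files. As a sketch, the statements from this answer can be assembled in that order with a few lines of Python. The function build_load_script and its arguments are hypothetical; the LOAD 'age' and search_path lines are the usual AGE session setup and are an addition not taken from this thread, so check them against your installation.

```python
def sql_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_load_script(graph, vertex_files, edge_files):
    """Return SQL statements that load vertices first, then edges.

    vertex_files and edge_files map a label name to a CSV file path.
    """
    g = sql_literal(graph)
    statements = [
        # Session setup assumed for AGE; not part of the thread's answers.
        "LOAD 'age';",
        'SET search_path = ag_catalog, "$user", public;',
        f"SELECT create_graph({g});",
    ]

    # Vertices first, so that edge endpoints already exist.
    for label, path in vertex_files.items():
        name = sql_literal(label)
        statements.append(f"SELECT create_vlabel({g}, {name});")
        statements.append(
            f"SELECT load_labels_from_file({g}, {name}, {sql_literal(path)});"
        )

    for label, path in edge_files.items():
        name = sql_literal(label)
        statements.append(f"SELECT create_elabel({g}, {name});")
        statements.append(
            f"SELECT load_edges_from_file({g}, {name}, {sql_literal(path)});"
        )

    return "\n".join(statements)


if __name__ == "__main__":
    print(
        build_load_script(
            "GraphName",
            {"LabelName": "Path/to/nodes.csv"},
            {"EdgeLabelName": "Path/to/edges.csv"},
        )
    )
```

The printed script can be pasted into psql. Generating it this way only fixes the ordering and quoting; it does not validate the files themselves.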
    

    For bulk loading of labels, you can also import them from a CSV file:

    load_labels_from_file('<graph name>', 
                      '<label name>',
                      '<file path>')
    

    Or you can use this:

    load_labels_from_file('<graph name>', 
                      '<label name>',
                      '<file path>', 
                      false)
    

    Here, the fourth parameter is optional and is only used when the labels' CSV file does not contain an ID column.
    For more details you can also study this Answer: https://stackoverflow.com/a/76022161/20972645
