I was thinking on developing a function for age that returns a graph’s adjacency matrix. An adjacency matrix stores the number of edges between two nodes. The complexity to check if an edge exists within an adjacency matrix is O(1), while the space complexity is O(v^2) with v = number of vertices. AGE stores the vertices of each graph in graph_name._ag_label_vertex
, while the edges are stored in graph_name._ag_label_edge
.
The vertex table contains two columns: id and properties. The id column is of integer type, while the properties column is a JSON like format. The edge table contains the id of the edge, the id of the vertex which it comes from, the id of the vertex that it goes to, and the edge’s properties.
Example:
SELECT * FROM "graph_name"."Vertex";
id | properties
------------------+-------------
844424930131971 | {}
1407374883553281 | {}
SELECT * FROM "graph_name"."Edges";
id | start_id | end_id | properties
------------------+-----------------+------------------+------------
1125899906842625 | 844424930131971 | 1407374883553281 | {}
(1 row)
I was thinking on traversing the vertex column first, so that it is possible to store the vertices ids initially:
[0, 1, 2, 3]
[1, 0, 0, 0]
[2, 0, 0, 0]
[3, 0, 0, 0]
After that, loop through the edges table and store 1 if the edge between two nodes exists (if there are multiple edges between a pair of nodes, add 1 to the already set value for each edge). But I don’t know exactly how to do this in C for a postgres function. Which function should I call to traverse the tables and get the values from the id column? I though on using SPI twice, first for the vertices and the second time for the edges, but I don’t know if this is the best approach.
In the age–1.3.0.sql file, I declared the function as
CREATE FUNCTION ag_catalog.get_adjacency_matrix(graph_name name)
RETURNS integer[][]
LANGUAGE c
AS 'MODULE_PATHNAME';
And it’s definition in src/backend/commands/graph_commands.c
PG_FUNCTION_INFO_V1(get_adjacency_matrix);
/*
* Function for returning the graph's adjacency matrix. This adjacency matrix
* stores the number of edges between two nodes. The complexity to check if an
* edge exists within an adjacency matrix is O(1), while the space complexity
* is O(v^2) with v = number of vertices.
*/
Datum get_adjacency_matrix(PG_FUNCTION_ARGS)
{
PG_RETURN_VOID();
}
2
Answers
Apparently, the best way to traverse a table utilizing a C function for PostgreSQL is with the Server Programming Interface (SPI). Here's a code example:
From PostgreSQL's Documentation:
When we call
SPI_execute(const char * command, bool read_only, long count)
, it will execute the specified SQL command forcount
rows, and, ifread_only
is true, the command must be read only. Ifcount
is zero then the command is executed for all rows. This function returns a value depending on what query was executed. In this case, it returnsSPI_OK_SELECT
if the execution was successful.The actual number of rows for which the (last) command was executed is returned in the global variable
SPI_processed
. If the return value of the function is indeedSPI_OK_SELECT
, then we can use the global pointerSPITupleTable *SPI_tuptable
to access the result rows.The
SPITupleTable
struct is declared as:We can access the corresponding row from the
SPITupleTable
by doingSPI_tuptable->vals[i]
wherei
is the number of the row. To get the value of a certain column from the row, in this case, we need to callSPI_getvalue(HeapTuple row, TupleDesc rowdesc, int colnumber)
. This function will return achar*
to the corresponding data.If we want to convert it to
int64
, we callDatumGetInt64(DirectFunctionCall1(int8in, CStringGetDatum(result_string)))
.That’s not an answer but it can be considered as a guide from another function:
By looking at
get_all_edge_labels_per_graph
At some point we can figure out fetching tuples inside it as the following using those functions:
Function:
References: