PostgreSQL - Significance of IDs Starting with "N" in RDF Data Using RDFLib

KarthikJain
September 14, 2024
80 views
1 vote
2 Answers

I’m working with RDF data that was loaded into a PostgreSQL database using the RDFLib-SQLAlchemy library. While querying the asserted_statements table with SQL, I noticed that some subjects and objects have IDs that start with the letter "N".

Here’s a snippet of my SQL query and the results:

SELECT a.subject, a.predicate, a.object, b.subject, b.predicate, b.object
FROM public.kb_d5c47fc464_asserted_statements a
JOIN public.kb_d5c47fc464_asserted_statements b 
ON a.object = b.subject
WHERE a.object LIKE 'N%' AND b.subject LIKE 'N%' AND a.subject LIKE 'http%'
ORDER BY a.id ASC;

Sample Data Output:

subject	predicate	object
http://purl.obolibrary.org/obo/BFO_0000062	http://www.w3.org/2002/07/owl#propertyChainAxiom	N160ea22f83814f728990ceaafb6fbc43
http://purl.obolibrary.org/obo/BFO_0000062	http://www.w3.org/2002/07/owl#propertyChainAxiom	N1cb51000d673480fb7bff6975709ab97

I’m curious about the significance of these IDs starting with N. Are they generated by RDFLib or related to blank nodes? What role do they play in the RDF structure, and how should I interpret them?

Any insights into why these IDs are being used and their purpose would be helpful.

Tags: postgresql rdflib

Answers

- PeteKirkham
- September 13, 2024 at 12:58 pm
- 0 votes
The rest of the id looks like a UUID (version 4, variant 1) encoded as hex without dashes.
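As a rough way to check this claim against values pulled from the table, here is a minimal sketch using only the standard library. The helper name `looks_like_auto_bnode_id` is made up for illustration and is not part of rdflib; the "variant 1" mentioned above is what Python's `uuid` module calls `RFC_4122`.

```python
import re
import uuid

# "N" followed by exactly 32 lowercase hex digits (a UUID without dashes).
AUTO_BNODE_ID = re.compile(r"N[0-9a-f]{32}")


def looks_like_auto_bnode_id(value: str) -> bool:
    """Heuristic: does this string look like 'N' + UUID4 hex?"""
    if AUTO_BNODE_ID.fullmatch(value) is None:
        return False
    parsed = uuid.UUID(hex=value[1:])  # drop the leading "N"
    return parsed.version == 4 and parsed.variant == uuid.RFC_4122


for term in (
    "N160ea22f83814f728990ceaafb6fbc43",
    "N1cb51000d673480fb7bff6975709ab97",
    "http://purl.obolibrary.org/obo/BFO_0000062",
):
    print(term, looks_like_auto_bnode_id(term))
```

This is only a heuristic: a blank node whose label was supplied explicitly, or came from a parsed file, does not have to follow this shape.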

Searching the rdflib source for UUID leads to BNode.__new__, which contains:
```
            value = _prefix + f"{node_id}"
```
Here, value becomes the node ID, and _prefix defaults to the return value of _unique_id().

What is this unique id?

It is the letter ‘N’!
```
def _unique_id() -> str:
    # Used to read: """Create a (hopefully) unique prefix"""
    # now retained merely to leave internal API unchanged.
    # From BNode.__new__() below ...
    #
    # acceptable bnode value range for RDF/XML needs to be
    # something that can be serialzed as a nodeID for N3
    #
    # BNode identifiers must be valid NCNames" _:[A-Za-z][A-Za-z0-9]*
    # http://www.w3.org/TR/2004/REC-rdf-testcases-20040210/#nodeID
    return "N"  # ensure that id starts with a letter
```
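To see why the prefix matters, here is a small standard-library sketch. It does not call rdflib; `make_node_id` and `is_valid_node_id` are hypothetical helpers that only imitate the shape of the generated ID and test it against the pattern quoted in the comment above. A bare UUID hex string can start with a digit, which that pattern rejects, and prepending a letter fixes it.

```python
import re
from uuid import uuid4

# Pattern quoted in the source comment: a letter, then letters or digits.
NODE_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def make_node_id(prefix: str = "N") -> str:
    """Imitate the shape of an auto-generated ID: prefix + UUID4 hex."""
    return prefix + uuid4().hex


def is_valid_node_id(value: str) -> bool:
    """Check a candidate ID against the quoted pattern."""
    return NODE_ID_PATTERN.fullmatch(value) is not None


bare_hex = "160ea22f83814f728990ceaafb6fbc43"  # starts with a digit
print(is_valid_node_id(bare_hex))        # False
print(is_valid_node_id("N" + bare_hex))  # True
print(is_valid_node_id(make_node_id()))  # True
```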

- WhiteGobo
- September 14, 2024 at 9:14 am
- 0 votes
The nodes whose IDs start with the letter N are just blank nodes. Treat them as existential variables, since you can't reuse their IDs in further SPARQL queries. You can, however, use them inside rdflib, e.g.:
```
import rdflib

# graph: an existing rdflib.Graph
bnode_from_query = rdflib.BNode("N1234...")
triples_using_bnode_as_subject = graph.triples((bnode_from_query, None, None))
```
They are helpful when you need to compare two graphs that contain nodes without a name; see, e.g., rdflib.compare.