
I’m working with RDF data stored in a PostgreSQL database that was added using the RDFLib-SQLAlchemy library. While querying the asserted_statements table using SQL, I noticed that some objects and subjects have IDs that start with the letter "N".

Here’s a snippet of my SQL query and the results:

SELECT a.subject, a.predicate, a.object, b.subject, b.predicate, b.object
FROM public.kb_d5c47fc464_asserted_statements a
JOIN public.kb_d5c47fc464_asserted_statements b 
ON a.object = b.subject
WHERE a.object LIKE 'N%' AND b.subject LIKE 'N%' AND a.subject LIKE 'http%'
ORDER BY a.id ASC;

Sample Data Output:

I’m curious about the significance of these IDs starting with N. Are they generated by RDFLib or related to blank nodes? What role do they play in the RDF structure, and how should I interpret them?

Any insights into why these IDs are being used and their purpose would be helpful.

2 Answers


  1. The rest of the id looks like a UUID (version 4, variant 1) encoded as hex without dashes.

    Looking through the source for UUID leads to BNode.__new__ which contains:

                value = _prefix + f"{node_id}"
    

    where value becomes the node id and _prefix defaults to _unique_id() (source)

    What is this unique id?

    It is the letter ‘N’!

    def _unique_id() -> str:
        # Used to read: """Create a (hopefully) unique prefix"""
        # now retained merely to leave internal API unchanged.
        # From BNode.__new__() below ...
        #
        # acceptable bnode value range for RDF/XML needs to be
        # something that can be serialzed as a nodeID for N3
        #
        # BNode identifiers must be valid NCNames" _:[A-Za-z][A-Za-z0-9]*
        # http://www.w3.org/TR/2004/REC-rdf-testcases-20040210/#nodeID
        return "N"  # ensure that id starts with a letter
    

    (source) (alternative source)
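The observation that the ID is the letter N followed by a dash-free UUID can be checked with the standard library alone. This is a minimal sketch and not part of rdflib: the helper name `parse_bnode_id` is hypothetical, and the sample ID is generated locally with `uuid.uuid4().hex` rather than taken from the database.

```python
import re
import uuid
from typing import Optional

# "N" followed by exactly 32 hex digits (a UUID written without dashes).
_BNODE_ID = re.compile(r"N[0-9a-fA-F]{32}")


def parse_bnode_id(identifier: str) -> Optional[uuid.UUID]:
    """Return the UUID embedded in an 'N' + 32-hex-digit ID, or None.

    Hypothetical helper for illustration; rdflib does not provide it.
    """
    if _BNODE_ID.fullmatch(identifier) is None:
        return None
    return uuid.UUID(hex=identifier[1:])


# Build an ID of the same shape as the ones in the table (assumed shape).
sample = "N" + uuid.uuid4().hex
parsed = parse_bnode_id(sample)

# Report the UUID version and whether the variant is the RFC 4122 one.
print(parsed.version, parsed.variant == uuid.RFC_4122)
```

Running the same helper over a value copied from the `subject` or `object` column should show whether that value follows this pattern. IRIs such as `http://...` return `None`.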

  2. Nodes whose IDs start with the letter N are just blank nodes. Think of them as existential variables: you can't reuse them in later SPARQL queries. You can, however, use them inside rdflib, e.g.:

    import rdflib

    # graph: an existing rdflib.Graph
    bnode_from_query = rdflib.BNode("N1234...")
    # All triples that have this blank node as their subject
    triples_using_bnode_as_subject = graph.triples((bnode_from_query, None, None))
    

    They are helpful when you need to compare two graphs that contain nodes without a name; see e.g. rdflib.compare.
