In RDF, a blank node is a node in an RDF graph representing a resource for which neither a URI nor a literal is given. The resource represented by a blank node is also called an anonymous resource. According to the RDF standard, a blank node can only be used as the subject or object of an RDF triple.
Notation in serialization formats
Blank nodes can be denoted through blank node identifiers in the following formats: RDF/XML, RDFa, Turtle, N3 and N-Triples. The following example shows how this works in RDF/XML.
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:ex="http://example.org/data#">
Blank node identifiers are scoped to the serialization of a particular RDF graph, i.e. a node named _:b in one graph does not represent the same node as a node named _:b in any other graph. Blank nodes can also be denoted through nested elements. Here are the same triples as above.
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:ex="http://example.org/data#">
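The scoping rule for blank node identifiers can be illustrated with a minimal Python sketch. It assumes a graph is modelled as a set of (subject, predicate, object) tuples and that any string starting with "_:" is a blank node; the function names, the resource names ex:web-data and ex:other-course, and the second person's name are hypothetical and chosen only for illustration. When two graphs are merged, each graph's blank nodes are given fresh identifiers so that two unrelated nodes that both happen to be labelled _:b are not confused.

```python
from itertools import count


def is_blank(term):
    """Treat any string starting with '_:' as a blank node identifier."""
    return isinstance(term, str) and term.startswith("_:")


def merge(*graphs):
    """Merge graphs, renaming each graph's blank nodes to fresh identifiers."""
    fresh = count()
    merged = set()
    for graph in graphs:
        renamed = {}  # per-graph map: local label -> fresh label

        def rename(term):
            if not is_blank(term):
                return term
            if term not in renamed:
                renamed[term] = f"_:g{next(fresh)}"
            return renamed[term]

        for s, p, o in graph:
            merged.add((rename(s), p, rename(o)))
    return merged


# Two graphs that both use the label _:b for different anonymous resources.
g1 = {("ex:web-data", "ex:professor", "_:b"), ("_:b", "ex:fullName", "Alice Carol")}
g2 = {("ex:other-course", "ex:professor", "_:b"), ("_:b", "ex:fullName", "Bob Smith")}

merged = merge(g1, g2)
blank_labels = {t for triple in merged for t in triple if is_blank(t)}
naive_labels = {t for triple in (g1 | g2) for t in triple if is_blank(t)}

print(len(merged), len(blank_labels))  # 4 triples, 2 distinct blank nodes
print(len(naive_labels))               # a plain set union would leave only 1
```

A plain set union of the two graphs would wrongly state that one anonymous resource has both names, which is why the merge renames blank nodes first.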
Below is the same example in Turtle.
@prefix ex: <http://example.org/data#> .
ex:title "Web Data" ; ex:professor .
Usability
Blank nodes are treated as simply indicating the existence of a thing, without using a URI to identify any particular thing. This is not the same as assuming that the blank node indicates an 'unknown' URI.
Anonymous resources in RDF
From a technical perspective, blank nodes give the capability to:
describe multi-component structures, like the RDF containers,
describe reification,
represent complex attributes without having to name explicitly the auxiliary node and
offer protection of the inner information.
Below there is an example where blank nodes are used to represent resources in the aforementioned ways. In particular, the blank node with the identifier '_:students' represents a Bag RDF Container, the blank node with the identifier '_:address' represents a complex attribute and those with the identifiers '_:activity1' and '_:activity2' represent events in the lifecycle of a digital object.
ex:title "Web Data" ;
    ex:professor _:entity ;
    ex:students _:students ;
    ex:generatedBy _:activity1.
_:entity ex:fullName "Alice Carol" ;
    ex:homePage ;
    ex:hasAddress _:address.
_:address a ex:Address ;
    ex:streetAddress "123 Main St." ;
    ex:postalCode "A1A1A1" ;
    ex:addressLocality "London".
_:students a rdf:Bag ;
    ex:hasMember _:s1 ;
    ex:hasMember _:s2.
_:activity1 a ex:Event;
    ex:creator _:entity ;
    ex:atTime "Tuesday 11 February, 06:51:00 CST".
_:activity2 a ex:Event, ex:Update ;
    ex:actionOver _:activity1 ;
    ex:creator _:entity2 ;
    ex:atTime "Monday 17 February, 08:12:00 CST".
The ontology language OWL uses blank nodes to represent anonymous classes such as unions or intersections of classes, or classes called restrictions, defined by a constraint on a property. For example, to express that a person has at most one birth date, one defines the class "Person" as a subclass of an anonymous class of type "owl:Restriction". This anonymous class is defined by two attributes specifying the constrained property and the constraint itself.
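The birth-date restriction can be written out as triples in a small Python sketch. The terms owl:Restriction, owl:onProperty, owl:maxCardinality and rdfs:subClassOf are standard vocabulary; the names ex:Person and ex:birthDate, the blank node label _:r and the helper describe are assumptions made for illustration.

```python
# The anonymous restriction class has no URI, so it is a blank node.
restriction = "_:r"

triples = [
    ("ex:Person", "rdfs:subClassOf", restriction),
    (restriction, "rdf:type", "owl:Restriction"),
    (restriction, "owl:onProperty", "ex:birthDate"),  # the constrained property
    (restriction, "owl:maxCardinality", 1),           # the constraint itself
]


def describe(node, graph):
    """Collect the predicate/object pairs attached to one node."""
    return {p: o for s, p, o in graph if s == node}


print(describe(restriction, triples))
```

The blank node only serves to group the two attributes; nothing outside this graph can refer to the restriction by name.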
According to an empirical survey of Linked Data published on the Web, out of the 783 domains contributing to the corpus, 345 did not publish any blank nodes. The average percentage of unique terms which were blank nodes for each domain was 7.5%, indicating that although a small number of high-volume domains publish many blank nodes, many other domains publish blank nodes far less frequently.
Of the 286.3 MB of unique terms found in data-level positions, 165.4 MB were blank nodes, 92.1 MB were URIs, and 28.9 MB were literals. Each blank node had on average 5.2 data-level occurrences: it occurred, on average, 0.99 times in the object position of a non-rdf:type triple and 4.2 times in the subject position of a triple.
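As a quick consistency check on these figures, the shares of each term kind can be recomputed in a few lines of Python. The numbers are taken as quoted in the text, with units as given there; the variable names are illustrative.

```python
total = 286.3
counts = {"blank nodes": 165.4, "URIs": 92.1, "literals": 28.9}

# Share of each kind of term, as a percentage of all unique terms.
shares = {kind: round(100 * n / total, 1) for kind, n in counts.items()}
print(shares)

# The three parts add up to the total only up to rounding of the quoted values.
parts_sum = sum(counts.values())
print(round(parts_sum - total, 1))

# Object-position plus subject-position occurrences per blank node.
occurrences = 0.99 + 4.2
print(round(occurrences, 1))
```

Blank nodes therefore account for well over half of the unique data-level terms, and the two per-position averages add up to the quoted 5.2 occurrences once rounded.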
Structure of blank nodes
According to the same empirical survey of linked data published on the Web, the majority of documents surveyed contain tree-based blank node structures. A small fraction contain complex blank node structures for which various tasks are potentially very expensive to compute.
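One way to make "tree-based" concrete is sketched below in Python. It assumes, as a simplification that may differ from the survey's exact definition, that a blank node structure is tree-based when the undirected graph formed by triples whose subject and object are both blank nodes contains no cycle. Graphs are modelled as sets of tuples, and all names are illustrative.

```python
def is_blank(term):
    """Treat any string starting with '_:' as a blank node identifier."""
    return isinstance(term, str) and term.startswith("_:")


def blank_structure_is_forest(graph):
    """Return True if the blank-to-blank edges form an acyclic undirected graph."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    seen_edges = set()
    for s, p, o in graph:
        if not (is_blank(s) and is_blank(o)):
            continue  # only edges between two blank nodes matter here
        if s == o:
            return False  # a self-loop is a cycle
        edge = frozenset((s, o))
        if edge in seen_edges:
            continue  # several predicates between the same pair count once
        seen_edges.add(edge)
        root_s, root_o = find(s), find(o)
        if root_s == root_o:
            return False  # joining two already-connected nodes closes a cycle
        parent[root_s] = root_o
    return True


tree_like = {
    ("ex:doc", "ex:professor", "_:entity"),
    ("_:entity", "ex:hasAddress", "_:address"),
    ("_:entity", "ex:knows", "_:friend"),
}
cyclic = {
    ("_:a", "ex:knows", "_:b"),
    ("_:b", "ex:knows", "_:c"),
    ("_:c", "ex:knows", "_:a"),
}

print(blank_structure_is_forest(tree_like))  # True
print(blank_structure_is_forest(cyclic))     # False
```

Acyclic structures of this kind are the easy case for tasks such as matching or canonical labelling; the expensive cases arise when blank nodes are densely interconnected.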
The inability to match blank nodes increases the delta size and does not assist in detecting the changes between subsequent versions of a Knowledge Base. Building a mapping between the blank nodes of two compared Knowledge Bases that minimizes the delta size is NP-Hard in the general case. BNodeLand is a framework that deals with this problem and proposes solutions through particular tools.
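The effect of blank node matching on delta size can be sketched in Python. This is a brute-force illustration under stated assumptions, not the method used by BNodeLand: graphs are sets of tuples, the delta is the symmetric difference of the two triple sets, and every injective mapping from the old version's blank nodes to the new version's blank nodes is tried. All names and example data are hypothetical.

```python
from itertools import permutations


def is_blank(term):
    """Treat any string starting with '_:' as a blank node identifier."""
    return isinstance(term, str) and term.startswith("_:")


def blanks(graph):
    """Sorted list of the blank node identifiers used in a graph."""
    return sorted({t for s, p, o in graph for t in (s, o) if is_blank(t)})


def naive_delta_size(old, new):
    """Delta size when blank node labels are compared literally."""
    return len(old ^ new)


def best_mapping(old, new):
    """Exhaustively search for the blank node mapping with the smallest delta."""
    old_b, new_b = blanks(old), blanks(new)
    # Pad the targets so that surplus old blank nodes can stay unmatched.
    padding = [("unmatched", i) for i in range(max(0, len(old_b) - len(new_b)))]
    targets = new_b + padding
    best = None
    for perm in permutations(targets, len(old_b)):
        mapping = dict(zip(old_b, perm))
        renamed = {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in old}
        size = len(renamed ^ new)
        if best is None or size < best[0]:
            best = (size, mapping)
    return best


old = {("ex:doc", "ex:creator", "_:a"), ("_:a", "ex:name", "Alice")}
new = {
    ("ex:doc", "ex:creator", "_:x"),
    ("_:x", "ex:name", "Alice"),
    ("_:x", "ex:age", "30"),
}

print(naive_delta_size(old, new))  # 5: every triple looks changed
print(best_mapping(old, new))      # (1, {'_:a': '_:x'})
```

The number of candidate mappings grows factorially with the number of blank nodes, which is consistent with the problem being NP-hard in general and explains why practical tools rely on heuristics or on restricted graph structures.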
Entailment checking
Regarding the entailment problem, it has been proved that deciding simple or RDF/S entailment of RDF graphs is NP-complete, and that deciding equivalence of simple RDF graphs is isomorphism-complete.
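A brute-force check for simple entailment can be sketched in Python. It relies on the standard characterization that a graph G simply entails a graph H when the blank nodes of H can be mapped to terms of G so that every mapped triple of H occurs in G. Graphs are modelled as sets of tuples, and the function names and the resource ex:alice are assumptions made for illustration.

```python
from itertools import product


def is_blank(term):
    """Treat any string starting with '_:' as a blank node identifier."""
    return isinstance(term, str) and term.startswith("_:")


def simply_entails(g, h):
    """Return True if some mapping of h's blank nodes makes h a subgraph of g."""
    h_blanks = sorted({t for s, p, o in h for t in (s, o) if is_blank(t)})
    g_terms = sorted({t for s, p, o in g for t in (s, o)}, key=str)
    # Try every assignment of g's terms to h's blank nodes.
    for image in product(g_terms, repeat=len(h_blanks)):
        mapping = dict(zip(h_blanks, image))
        if all((mapping.get(s, s), p, mapping.get(o, o)) in g for s, p, o in h):
            return True
    return False


def equivalent(g, h):
    """Two graphs are equivalent when each simply entails the other."""
    return simply_entails(g, h) and simply_entails(h, g)


g = {("ex:web-data", "ex:professor", "ex:alice")}
h = {("ex:web-data", "ex:professor", "_:x")}

print(simply_entails(g, h))  # True: "some professor exists" follows from g
print(simply_entails(h, g))  # False: h does not say the professor is ex:alice
print(equivalent(g, h))      # False
```

The search space has size |terms of G| raised to the number of blank nodes in H, so the sketch is exponential in the worst case, in line with the NP-completeness result. It also reflects the existential reading of blank nodes: the blank node in h asserts only that something fills the role, not which resource does.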