skip to Main Content

I am reading Bratko’s Prolog: Programming for Artificial Intelligence. The easiest way for me to understand lists is visualising them as binary trees, which goes well. However, I am confused about the empty list []. It seems to me that it has two meanings.

  1. When part of a list or enumeration, it is seen as an actual (empty) list element (because somewhere in the tree it is part of some Head), e.g. [a, []]
  2. When it is the only item inside a Tail, it isn’t an element it literally is nothing, e.g. [a|[]]

My issue is that I do not see the logic behind 2. Why is it required for lists to have this possible ‘nothingness’ as a final tail? Simply because the trees have to be binary? Or is there another reason? (In other words, why is [] counted as an element in 1. but it isn’t when it is in a Tail in 2?) Also, are there cases where the final (rightmost, deepest) final node of a tree is not ‘nothing’?

3

Answers


  1. In other words, why is [] counted as an element in 1. but it isn’t when it is in a Tail in 2?

    Those are two different things. Lists in Prolog are (degenerate) binary trees, but also very much like a singly linked list in a language that has pointers, say C.

    In C, you would have a struct with two members: the value, and a pointer to the next list element. Importantly, when the pointer to next points to a sentinel, this is the end of the list.

    In Prolog, you have a functor with arity 2: ./2 that holds the value in the first argument, and the rest of the list in the second:

    .(a, Rest)
    

    The sentinel for a list in Prolog is the special []. This is not a list, it is the empty list! Traditionally, it is an atom, or a functor with arity 0, if you wish.

    In your question:

    • [a, []] is actually .(a, .([], []))
    • [a|[]] is actually .(a, [])

    which is why:

    ?- length([a,[]], N).
    N = 2.
    

    This is now a list with two elements, the first element is a, the second element is the empty list [].

    ?- [a|[]] = [a].
    true.
    

    This is a list with a single element, a. The [] at the tail just closes the list.

    Question: what kind of list is .([], [])?

    Also, are there cases where the final (rightmost, deepest) final node of a tree is not ‘nothing’?

    Yes, you can leave a free variable there; then, you have a “hole” at the end of the list that you can fill later. Like this:

    ?- A = [a, a|Tail], % partial list with two 'a's and the Tail
       B = [b,b], % proper list
       Tail = B. % the tail of A is now B
    A = [a, a, b, b], % we appended A and B without traversing A
    Tail = B, B = [b, b].
    

    You can also make circular lists, for example, a list with infinitely many x in it would be:

    ?- Xs = [x|Xs].
    Xs = [x|Xs].
    

    Is this useful? I don’t know for sure. You could for example get a list that repeats a, b, c with a length of 7 like this:

    ?- ABCs = [a,b,c|ABCs], % a list that repeats "a, b, c" forever
       length(L, 7), % a proper list of length 7
       append(L, _, ABCs). % L is the first 7 elements of ABCs
    ABCs = [a, b, c|ABCs],
    L = [a, b, c, a, b, c, a].
    

    In R at least many functions “recycle” shorter vectors, so this might be a valid use case.

    See this answer for a discussion on difference lists, which is what A and Rest from the last example are usually called.

    See this answer for implementation of a queue using difference lists.

    Login or Signup to reply.
  2. Your confusion comes from the fact that lists are printed (and read) according to a special human-friendly format. Thus:

    [a, b, c, d]
    

    … is syntactic sugar for .(a, .(b, .(c, .(d, [])))).

    The . predicate represents two values: the item stored in a list and a sublist. When [] is present in the data argument, it is printed as data.
    In other words, this:

    [[], []]
    

    … is syntactic sugar for .([], .([], [])).
    The last [] is not printed because in that context it does not need to. It is only used to mark the end of current list. Other [] are lists stored in the main list.

    I understand that but I don’t quite get why there is such a need for that final empty list.

    The final empty list is a convention. It could be written empty or nil (like Lisp), but in Prolog this is denoted by the [] atom.
    Note that in prolog, you can leave the sublist part uninstantiated, like in:

    [a | T]
    

    which is the same as:

    .(a, T)
    

    Those are known as difference lists.

    Login or Signup to reply.
  3. Your understanding of 1. and 2. is correct — where by “nothing” you mean, element-wise. Yes, an empty list has nothing (i.e. no elements) inside it.

    The logic behind having a special sentinel value SENTINEL = [] to mark the end of a cons-cells chain, as in [1,2,3] = [1,2|[3]] = [1,2,3|SENTINEL] = .(1,.(2,.(3,SENTINEL))), as opposed to some ad-hoc encoding, like .(1,.(2,3)) = [1,2|3], is types consistency. We want the first field of a cons cell (or, in Prolog, the first argument of a . functored term) to always be treated as “a list’s element”, and the second — as “a list”. That’s why [] in [1, []] counts as a list’s element (as it appears as a 1st argument of a .-functored compound term), while the [] in [1 | []] does not (as it appears as a 2nd argument of such term).

    Yes, the trees have to be binary — i.e. the functor . as used to encode lists is binary — and so what should we put there in the final node’s tail field, that would signal to us that it is in fact the final node of the chain? It must be something, consistent and easily testable. And it must also represent the empty list, []. So it’s only logical to use the representation of an empty list to represent the empty tail of a list.

    And yes, having a non-[] final “tail” is perfectly valid, like in [1,2|3], which is a perfectly valid Prolog term — it just isn’t a representation of a list {1 2 3}, as understood by the rest of Prolog’s built-ins.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search