2.4 Pairs, Lists, and Scheme Syntax

Version: 4.1

2.4 Pairs, Lists, and Scheme Syntax

The cons function actually accepts any two values, not just a list for the second argument. When the second argument is not empty and not itself produced by cons, the result prints in a special way. The two values joined with cons are printed between parentheses, but with a dot (i.e., a period surrounded by whitespace) in between:

  > (cons 1 2)
  (1 . 2)
  > (cons "banana" "split")
  ("banana" . "split")

Thus, a value produced by cons is not always a list. In general, the result of cons is a pair. The more traditional name for the cons? function is pair?, and we’ll use the traditional name from now on.

The name rest also makes less sense for non-list pairs; the more traditional names for first and rest are car and cdr, respectively. (Granted, the traditional names are also nonsense. Just remember that “a” comes before “d,” and cdr is pronounced “could-er.”)

Examples:
  > (car (cons 1 2))
  1
  > (cdr (cons 1 2))
  2
  > (pair? empty)
  #f
  > (pair? (cons 1 2))
  #t
  > (pair? (list 1 2 3))
  #t

Scheme’s pair datatype and its relation to lists is essentially a historical curiosity, along with the dot notation for printing and the funny names car and cdr. Pairs are deeply wired into to the culture, specification, and implementation of Scheme, however, so they survive in the language.

You are perhaps most likely to encounter a non-list pair when making a mistake, such as accidentally reversing the arguments to cons:

  > (cons (list 2 3) 1)
  ((2 3) . 1)
  > (cons 1 (list 2 3))
  (1 2 3)

Non-list pairs are used intentionally, sometimes. For example, the make-immutable-hash function takes a list of pairs, where the car of each pair is a key and the cdr is an arbitrary value.

The only thing more confusing to new Schemers than non-list pairs is the printing convention for pairs where the second element is a pair, but is not a list:

> (cons 0 (cons 1 2))
(0 1 . 2)

In general, the rule for printing a pair is as follows: use the dot notation always, but if the dot is immediately followed by an open parenthesis, then remove the dot, the open parenthesis, and the matching close parenthesis. Thus, (0 . (1 . 2)) becomes (0 1 . 2), and (1 . (2 . (3 . ()))) becomes (1 2 3).

2.4.1 Quoting Pairs and Symbols with quote

After you see

> (list (list 1) (list 2) (list 3))
((1) (2) (3))

enough times, you’ll wish (or you’re already wishing) that there was a way to write just ((1) (2) (3)) and have it mean the list of lists that prints as ((1) (2) (3)). The quote form does exactly that:

  > (quote ((1) (2) (3)))
  ((1) (2) (3))
  > (quote ("red" "green" "blue"))
  ("red" "green" "blue")
  > (quote ())
  ()

The quote form works with the dot notation, too, whether the quoted form is normalized by the dot-parenthesis elimination rule or not:

  > (quote (1 . 2))
  (1 . 2)
  > (quote (0 . (1 . 2)))
  (0 1 . 2)

Naturally, lists of any kind can be nested:

  > (list (list 1 2 3) 5 (list "a" "b" "c"))
  ((1 2 3) 5 ("a" "b" "c"))
  > (quote ((1 2 3) 5 ("a" "b" "c")))
  ((1 2 3) 5 ("a" "b" "c"))

If you wrap an identifier with quote, then you get output that looks like an identifier:

> (quote jane-doe)
jane-doe

A value that prints like an identifier is a symbol. In the same way that parenthesized output should not be confused with expressions, a printed symbol should not be confused with an identifier. In particular, the symbol (quote map) has nothing to do with the map identifier or the predefined function that is bound to map, except that the symbol and the identifier happen to be made up of the same letters.

Indeed, the intrinsic value of a symbol is nothing more than its character content. In this sense, symbols and strings are almost the same thing, and the main difference is how they print. The functions symbol->string and string->symbol convert between them.

Examples:
  > map
  #<procedure:map>
  > (quote map)
  map
  > (symbol? (quote map))
  #t
  > (symbol? map)
  #f
  > (procedure? map)
  #t
  > (string->symbol "map")
  map
  > (symbol->string (quote map))
  "map"

When quote is used on a parenthesized sequence of identifiers, it creates a list of symbols:

  > (quote (road map))
  (road map)
  > (car (quote (road map)))
  road
  > (symbol? (car (quote (road map))))
  #t

2.4.2 Abbreviating quote with ’

If (quote (1 2 3)) still seems like too much typing, you can abbreviate by just putting ' in front of (1 2 3):

  > '(1 2 3)
  (1 2 3)
  > 'road
  road
  > '((1 2 3) road ("a" "b" "c"))
  ((1 2 3) road ("a" "b" "c"))

In the documentation, ' is printed in green along with the form after it, since the combination is an expression that is a constant. In DrScheme, only the ' is colored green. DrScheme is more precisely correct, because the meaning of quote can vary depending on the context of an expression. In the documentation, however, we routinely assume that standard bindings are in scope, and so we paint quoted forms in green for extra clarity.

A ' expands to a quote form in quite a literal way. You can see this if you put a ' in front of a form that has a ':

  > (car '(quote road))
  quote
  > (car ''road)
  quote

Beware, however, that the REPL’s printer recognizes the symbol quote when printing output, and then it uses ’ in the output:

  > 'road
  road
  > ''road
  'road
  > '(quote road)
  'road

2.4.3 Lists and Scheme Syntax

Now that you know the truth about pairs and lists, and now that you’ve seen quote, you’re ready to understand the main way in which we have been simplifying Scheme’s true syntax.

The syntax of Scheme is not defined directly in terms of character streams. Instead, the syntax is determined by two layers:

a read layer, which turns a sequence of characters into lists, symbols, and other constants; and
an expand layer, which processes the lists, symbols, and other constants to parse them as an expression.

The rules for printing and reading go together. For example, a list is printed with parentheses, and reading a pair of parentheses produces a list. Similarly, a non-list pair is printed with the dot notation, and a dot on input effectively runs the dot-notation rules in reverse to obtain a pair.

One consequence of the read layer for expressions is that you can use the dot notation in expressions that are not quoted forms:

> (+ 1 . (2))
3

This works because (+ 1 . (2)) is just another way of writing (+ 1 2). It is practically never a good idea to write application expressions using this dot notation; it’s just a consequence of the way Scheme’s syntax is defined.

Normally, . is allowed by the reader only with a parenthesized sequence, and only before the last element of the sequence. However, a pair of .s can also appear around a single element in a parenthesized sequence, as long as the element is not first or last. Such a pair triggers a reader conversion that moves the element between .s to the front of the list. The conversion enables a kind of general infix notation:

  > (1 . < . 2)
  #t
  > '(1 . < . 2)
  (< 1 2)

This two-dot convention is non-traditional, and it has essentially nothing to do with the dot notation for non-list pairs. PLT Scheme programmers use the infix convention sparingly – mostly for asymmetric binary operators such as < and is-a?.

top← prev up next →

1	Welcome to PLT Scheme
2	Scheme Essentials
3	Built-In Datatypes
4	Expressions and Definitions
5	Programmer-Defined Datatypes
6	Modules
7	Contracts
8	Input and Output
9	Regular Expressions
10	Exceptions and Control
11	Iterations and Comprehensions
12	Pattern Matching
13	Classes and Objects
14	Units (Components)
15	Reflection and Dynamic Evaluation
16	Macros
17	Performance
18	Running and Creating Executables
19	Compilation and Configuration
20	More Libraries
	Bibliography
	Index

2.1	Simple Values
2.2	Simple Definitions and Expressions
2.3	Lists, Iteration, and Recursion
2.4	Pairs, Lists, and Scheme Syntax

2.4.1	Quoting Pairs and Symbols with quote
2.4.2	Abbreviating quote with ’
2.4.3	Lists and Scheme Syntax