Brown CS: CSCI1730: Programming Languages: Prolog

Complete this assignment with the same team you worked with for Type Inference (Written). You and your partner must each understand the answers to all the problems, so don't just split up the work.

A Brief History of Prolog at Brown

Student: Are you doing Prolog again this year?
Shriram: Yes.
Student: That was the hardest assignment I ever did at Brown—and I even did Weenix!
Shriram: Well, I was thinking of scaling it down this year....
Student: No! You can't do that. It's like a rite of passage.

Overview

In this assignment, you will implement Prolog-style logic variables and backtracking search using Scheme's macros and continuations. Unlike the interpreters you've written so far, the "language" for this assignment is embedded in Scheme, and will use Scheme's environment for its environment, Scheme's store, etc.

The language has two principal features: logic variables and syntactic forms for construction search trees. Search trees are defined by the following grammar:

<search-expr> ::= =succeed
              |   =fail
              |   (=and <search-expr> ...)
              |   (=or <search-expr> ...)
              |   (== <logic-expr> <logic-expr>)
              |   <scheme-expr> ; that evaluates to a <search-expr>

The form (== <logic-expr> <logic-expr>) attempts to unify logic variables and Scheme values:

<logic-expr> ::= (_)
             |  (=cons <logic-expr> <logic-expr>)
             |   <scheme-expr> ; that evaluates to a flat value, list or logic variable

The expression (_) creates a fresh, anonymous logic variable. You will also define a derived form that creates named logic variables:

<search-expr> ::= ... as before ...
              |   (=var (id ...) <search-expr>)

Finally, you will define forms to query the search trees that are constructed by <search-expr>s:

(=find-all (id ...) <search-expr>) 
=> (list (list ('id <value>) ...) ...)

; n is a non-negative integer
(=find-some n (id ...) <search-expr>) 
=> (list (list ('id <value-1>) ...) ... (list ('id <value-n>) ...))

Search Trees and Logic Variables

=succeed succeeds exactly once.
=fail does not succeed.
(=or e ...) succeeds when any of its subexpressions succeed.
(=and e ...) succeeds when all of its subexpressions succeed.
(_) creates an anonymous logic variable.
(=var (id ...) <search-expr>) binds id ... to fresh logic variables and evaluates <search-expr> in the extended environment.
Note that =var is a derived form. You should be able to transform any =var expressions into a lambda with (_).
(== e1 e2) attempts to unify e1 with e2. e1 and e2 must be either Scheme values or logic variables. If e1 and e2 can be unified, the expression succeeds. If they cannot be unified, the expression fails. Unification must always terminate: you must implement an occurs check.

Two Scheme values can be unified if they are equal?. For example, (== 5 5) succeeds, whereas (== 0 1) fails. Unification for logic variables is trickier. If a logic variable is unbound, it assumes the value necessary to make the expression succeed. For example, the following expression succeeds:
```
(=var (x) (== x 0))
```
The logic variable x assumes the value 0. However, if a logic variable is already bound, it must be treated as the value its bound to. For example, the following expression fails:
```
(=var (x) (=and (== x 0) (== x 1)))
```

Some of these abstractions may be implementable as Scheme procedures, while others will require the use of macros. In either case, recall the pattern we developed in class for performing backtracking: the unification and search primitives (i.e. =succeed, =fail, ==, =or, and =and) consume a failure continuation (to be invoked on failure) and, if successful, return a success continuation (to allow resumption of the search). This pattern prevents the Prolog embedding from communicating ordinary program answers through procedure return values. Instead it uses logic variables, which you will need to update imperatively, taking care to restore when backtracking.

However, you may represent search-trees (i.e. <search-expr>) and logic variables with any data structure you choose.

Conducting a Search and Printing Results

In principle, evaluating a <search-expr> will construct a search tree, but will not execute a search. To conduct a search, you must specify which logic variables you are interested in and how many results to return. To this end, you must implement the following forms:

(=find-all (id ...) <search-expr>) binds id ... to logic variables, evaluates <search-expr> and returns a list of all the solutions to <search-expr>. Each solution is a list of variable bindings (one binding for each id), where a variable binding is a two-element list consisting of the quoted variable name (a symbol) and its value. Note that if the value is a list, it should be returned as a Scheme list.

A logic variable in a solution may be unbound. If so, you should print some placeholder value to represent its state:. For example:
```
(=find-all (x) =succeed)
=> '(([x g642]))
```
However, if a pair of unbound logic variables have been unified, you must ensure that in your results, they appear bound to the same placeholder. For example:
```
(=find-all (x y) (== x y))
=> '(([x g341] [y g341]))
```
The solutions should be returned in their order of discovery (by a left-to-right depth-first search), and each solution should list the variables in the order provided.
(=find-some n (id ...) <search-expr>) returns the first n solutions of <search-expr>. This allows us to query search trees with infinitely many solutions.

Note that there is nothing special about the variables bound by =find-all and =find-some. They are simply the logic variables whose values the user is interested in.

Data Structures

Your implementation must be able to cope with flat values and lists. Flat values (numbers, symbols, strings, empty, ...) are straightforward; you will be able to directly bind logic variables to them and unify them with logic variables.

Composite data structures may require more effort, particularly when they contain logic variables. You are only required to support lists (and therefore, lists containing logic variables). To this end, you may define =cons to create a pair, along with any additional processing you need.

All other list processing functions should be trivial to define with =cons. For example, this definition of =list should work:

(define =list
  (lambda args
    (if (empty? args)
        empty
        (=cons (first args) (apply =list (rest args))))))
        
(test (=find-all (x y) (== (=list x 5) (=cons 6 (=cons y empty))))
      '([(x 6) (y 5)]))

Testing

Test your code using the standard testing forms. (test, test/exn, etc.) Since the representation of search trees is implementation-specific, your test cases will probably all use =find-all and =find-some. Keep in mind that these expressions have a specified output format and return results in a predictable order.

Interactive Testing

printf: the weapon of champions

These are not required by the assignment, but to aid in interactive testing (as shown in the examples below), you may find the following two definitions helpful:

(define resumer-box (box 'resumer))

(define (next) ((unbox resumer-box) 'dummy))
(define-syntax show
  (syntax-rules ()
    [(show (vars ...) e)
     (=var (vars ...)
           (let/cc esc
             (let/cc fk
               (esc (let ([next (e fk)])
                      (set-box! resumer-box next)
                      (printf "~a: ~a~n" (quote vars) (lv-value vars))
                      ...
                      (void))))
             'fail))]))

Example 1: Simple Search

You might want to start by just testing =and and =or with success and failure (no logic variables). In this restricted setting, there are a couple of useful properties: namely, if e1 succeeds in n1 ways, e2 succeeds in n2 ways, etc., then (=or e1 e2 ...) succeeds in n1 + n2 + ... ways, and (=and e1 e2 ...) succeeds in n1 * n2 * ... ways. For instance,

(=and (=or =succeed =succeed =fail =succeed)
      (=or =fail =succeed =succeed))

succeeds in six ways. The first =or succeeds in three ways, the second =or succeeds in two ways, and the =and combines them in all possible ways. Likewise,

(=or (=and =succeed =succeed)
     (=and =fail =succeed =succeed)
     (=and =succeed))

succeeds in two ways. The first and third =ands each succeed in one way, and the =or explores both of them.

You can write test cases for this language that count the number for successes. For example:

(test (length
       (=find-all ()
                  (=and (=or =succeed =succeed =fail =succeed)
                        (=or =fail =succeed =succeed))))
      6)

Example 2: Family Trees

As a more complicated example, suppose we have the following definitions:

(define (=parent c p)
  (=or (=and (== c 'vito) (== p 'dom))
       (=and (== c 'sonny) (== p 'vito))
       (=and (== c 'michael) (== p 'vito))
       (=and (== c 'fredo) (== p 'vito))
       (=and (== c 'sophia) (== p 'michael))
       (=and (== c 'tony) (== p 'michael))))

(define (=ancestor X Y)
  (=or (=parent X Y)
       (=var (Z)
             (=and (=parent X Z)
                   (=ancestor Z Y)))))

Then we get the following query results:

> (show (x) (=ancestor x 'vito))
x: sonny
> (next)
x: michael
> (next)
x: fredo
> (next)
x: sophia
> (next)
x: tony
> (next)
'fail
> (=find-all (x) (=parent x 'michael))
(list (list (list 'x 'sophia))
      (list (list 'x 'tony)))
> (=find-some 3 (x y) (=ancestor x y))
(list (list (list 'x 'vito) (list 'y 'dom))
      (list (list 'x 'sonny) (list 'y 'vito))
      (list (list 'x 'michael) (list 'y 'vito)))

Non-deterministic Finite Automata

The textbook defines an automaton macro that allows you to define deterministic finite automata. For example, the following automaton is supposed to accept lists of the form c(a|d)*r:

(define m (automaton init
            [init : (c -> more)
            [more : (a -> more)
                    (d -> more)
                    (r -> end)]
            [end :]))

Unfortunately, automaton as specified in the text has a bug. One characterization of the bug is that automaton doesn't distinguish accepting states from other states. Fix the bug by making accepting states explicit. The interface for automata with explicit accepting states should be:

(automaton
  start-state
  (accept accept-state ...)
  (state : (label -> target-state) ...)
  ...)

Alter your implementation so that you can define non-deterministic finite automata. Non-determinism allows you to repeat a label on the right-hand side of a state's transition table. For example:

(automaton
  init
  (accept end)
  (init : (c -> more))
  (more : (a -> more)
          (a -> end)
          (d -> more)
          (d -> end)
          (r -> end))

Your NFA implementation should use your Prolog implementation to achieve non-determinism.

Handin

Turn in one file, prolog.ss, containing your Prolog implementation, your NFA macro and your test suite.