Desugaring ParselTongue
In this assignment, you will desugar full ParselTongue to core ParselTongue. The core interpreter will be provided as a black-box; you will compose your desugarer with the core to produce a working implementation of the full language.
Materials
You should download two things for this assignment:
- The zip bundle binary for your architecture
- The template and test bundle
The template and test bundle should contain a directory called suite/
, with
several subdirectories, and two Racket files, typed-desugar-template.rkt
and
typed-lang.rkt
.
Your Task
Your task is to write a Racket program that implements a function called
desugar
, which converts between full ParselTongue and core ParselTongue.
These two definitions are given in typed-lang.rkt
First, ExprP
is the abstract syntax for full ParselTongue, which has
P-suffixed type constructors. These should obviously correspond to the
expression forms from the ParselTongue
specification. We're providing you with a
definition of, and an interpreter for, the language described by ExprC
, the
core language that ParselTongue uses.
You are going to write desugar, which should follow the following template:
(define (desugar [exprP : ExprP]) : ExprC
;; your code here
)
Your desugar is successful when it passes all the tests included in the binary, and runnable with --test-with-desugar. (We thank you all for providing such exhaustive tests in the first assignment!)
The rest of this document describes the core language, and has a walkthrough of how to get started with desugaring.
Features of the Core
The core language has a few features that don't show up in the surface language, and some features that warrant explanation:
TrueC
,FalseC
,NumC
, andStrC
simply evaluate to the obvious corresponding values, just likeNumC
in the language we've built up in class.ObjectC
creates an object value by evaluating all of the sub-expressions that make up the values of its fieldsGetFieldC
evaluates its subexpressions to values, and if they are an object and a string, provides the corresponding field in the object. It will also throw errors if non-objects or non-strings are provided, in correspondence with the specification.SetFieldC
evaluates its subexpressions to values, and if the first two are an object and a string, creates a new object with the string field replaced or added as appropriate. It will also throw errors if non-objects or non-strings are provided, in correspondence with the specification.FuncC
behaves almost just as thelamC
from the notes in chapter 8 - it closes over its environment and, when applied, creates a new location for each parameter. If more than one parameter of the same name are provided, the last one is bound to the argument in application.AppC
behaves just asappC
in the notes. It throws errors if the function position is not a function, and arity mismatch errors if the wrong number of arguments are provided. It evaluates its arguments left-to-right, and before it reports any arity mismatch errors.LetC
creates a new location, binds the provided identifier to the new location in the environment, and evaluates its body in that environment.IdC
evaluates to the value held at the location bound to the identifier in the current environment.Set!C
behaves assetC
in the notes - it replaces the value at the location bound to the identifier with the value that the right-hand-side evaluates to.(IfC test then else)
evaluates its test, and if that results in the false value, it evaluates theelse
branch. Otherwise, it evaluates thetrue
branch.SeqC
evaluates its first expression, throws out the result (but keeps side-effects), and evaluates the second expression.Prim1C
evaluates the unary operatorsprint
andtagof
:print
prints a string representation of its argument, as per the spectagof
is a primitive operation in the core that doesn't appear in the main language.tagof
takes a single argument, and returns a string representing its type:"object"
for objects"function"
for functions"string"
for strings"number"
for numbers"boolean"
for true and false
For example:
tagof(5) -> "number" tagof(+("asdf","foo")) -> "string" tagof({}) -> "object"
Prim2C
evaluates the binary operatorsnum+
,num-
,string+
,<
,>
, and==
num+
takes two number values and adds them together. Any non-numeric arguments result in an error with "Bad arguments for num+: ", followed by the string representations of its argumentsnum-
takes two number values and subtracts the second from the first. Any non-numeric arguments result in an error with "Bad arguments for num-: ", followed by the string representations of its argumentsstring+
takes two string values and concatenates them Any non-string arguments result in an error with "Bad arguments for string+: ", followed by the string representations of its arguments<
takes two number values and yieldstrue
if the first is smaller than the second, andfalse
if not. Any non-number arguments result in an error with "Bad arguments for <: ", followed by the string representations of its arguments>
takes two number values and yieldstrue
if the first is greater than the second, andfalse
if not. Any non-number arguments result in an error with "Bad arguments for >: ", followed by the string representations of its arguments==
takes any two values and compares them for equality using the Equal metafunction from the specification, returning true if they are Equal, and false otherwise.ErrorC
is an expression form that halts the program with a string message. The message is the same string thatprint
would output when given the same value, but the output occurs on stdout:(ErrorC (StrC "This is an error!")) -> "This is an error!" on stderr (ErrorC (ObjectC (list))) -> "object" on stderr
Getting Started, and Some Desugaring Tricks
Writing desugar is nontrivial, and we don't expect you to do it without help. For many cases, desugaring actually _is_ trivial, and handing the top level expression down basically unchanged is the right thing to do. For example, when desugaring number expressions `NumP`, we can just convert them directly to the number expression of the core:(define (desugar (exprP : ExprP)) : ExprC
(type-case ExprP exprP
[NumP (n) (NumC n)]
... more cases here ...
))
Other expressions, however, we express in terms of a combination of more fundamental features in the core. We'll go through two - while loops and arbitrary-arity subtraction.
Desugaring While Loops
To get you started, we're going to walk through what you need for
a pretty tricky desugaring - while
. Notice that WhileP
is defined
in ParselTongue (as an ExprP
), but is not defined at all in the core
language (there is no WhileC
). How do we pull off this desugaring?
Let's start by filling in a little bit of desugar
:
#lang plai-typed
(require "typed-lang.rkt")
(define (desugar [exprP : ExprP]) : ExprC
(type-case ExprP exprP
...
[WhileP (test body)
... fill in here ...]
)
What do we "fill in here?" Well, the specification has this to say:
- Evaluate test to a value v.
- If v is the value false, then yield false as the result of the whole expression
- Otherwise
- Evaluate body to a value v2.
- Evaluate test to a value v3.
- If v3 is the value false, then yield v2 as the result of the whole expression.
- Otherwise, go to sub-step 3.1 above and repeat.
We need to start by evaluating the test and checking if it's false.
That's pretty straightforward, we can us IfC
for that. Note the
recursive call to desugar
- we need to make sure we always have
expressions that are ExprC
's:
[WhileP (test body)
(IfC (desugar test)
... step 3 goes here ...
(FalseC)]
The next thing to fill in is the "Otherwise..." case of the specification. Let's try the straightforward transliteration of the spec again, evaluating body and then test again
[WhileP (test body)
(IfC (desugar test)
(SeqC (desugar body)
(IfC (desugar test)
... step 4 ...
... v2? ...
(FalseC)]
What should we put in place of v2
above? We could try putting the
desugared expression there again:
[WhileP (test body)
(IfC (desugar test)
(SeqC (desugar body)
(IfC (desugar test)
... step 4 ...
(desugar body)))
(FalseC)]
But this isn't a good idea. This program:
defvar x = 0
while(<(x,2)) {
x = +(x,1)
}
would result in x being bound to a location that contains 3 rather than 2, because the body's effects would be duplicated by the second occurrence of the body!
This is an important consideration in desugaring a language with mutation - we need to be really careful about understanding just what expressions we're putting where!
We need to store the result of evaluating
somewhere so we can only evaluate once. LetC
is perfect for this:
[WhileP (test body)
(IfC (desugar test)
(LetC 'temp
(desugar body)
(IfC (desugar test)
... step 4 ...
(IdC 'temp)))
(FalseC)]
Now, the value of evaluating the body is stored in the identifier temp
for us to access later.
On to step 4. We just need to go back to sub-step 3.1, which was to evaluate the body, then the test. We can just put the same thing in again:
[WhileP (test body)
(IfC (desugar test)
(LetC 'temp
(desugar body)
(IfC (desugar test)
(LetC 'temp
(desugar body)
(IfC (desugar test)
... step 4 ...
(IdC 'temp)))
(IdC 'temp)))
(FalseC)]
Uh oh. We can't just nest these expressions infinitely; we need some way to bundle up this computation and use it again. Thankfully, we know how to do that - we set up recursion! We can bind a dummy value to a name that we use in the body of a function we create, and then set the name to the function after, like so:
[WhileP (test body)
;; dummy-fun will tell us it was called if we do so accidentally
(local ([define dummy-fun (FuncC (list) (ErrorC (StrC "Dummy function")))])
(IfC (desugar test)
;; while-var will hold the actual function once we tie
;; everything together
(LetC 'while-var dummy-fun
(LetC 'while-func
;; this function does the real work - it runs the body of
;; the while loop, then re-runs it if the test is true, and
;; stops if its false
(FuncC (list)
(LetC 'temp-var
(desugar body)
(IfC (desugar test)
(AppC (IdC 'while-var) (list))
(IdC 'temp-var))))
;; The Set!C here makes sure that 'while-var will resolve
;; to the right value later, and the AppC kicks things off
(SeqC (Set!C 'while-var (IdC 'while-func))
(AppC (IdC 'while-var) (list)))))
(FalseC)))]
This is (one way) to implement a While
loop using only functions and
mutation. Our loop has grown up! We'll leave for
loops to you.
Note that we've carefully chosen to use variables with "-"
in them
(while-var
, temp-var
) User code cannot have identifiers that contain "-"
.
What bad things could happen if we used identifiers that could overlap with
user code?
Desugaring Subtraction
The specification for ParselTongue allows any number of arguments to the
-
operator. If they are all numbers, it performs subtraction
left-to-right. If any of the arguments aren't numbers, it throws an
error. Our core only supports binary subtraction in the form of num-
,
so we clearly have some work to do here. Let's get the skeleton of the
case written down:
(define (desugar [exprP : ExprP]) : ExprC
...
[PrimP (op args)
(case op
['- ... fill in subtraction here ...])]
...
)
And let's have a look at the specification, as well:
To evaluate the expression
-(
expr1,
...,
exprn)
, for n > 0:
- Evaluate expr1, ..., exprn to a list of values v1, ..., vn, starting with expr1, from left to right.
- If n is 1 and v1 is a number, then yield v1.
- Otherwise, if all of v1, ..., vn are numbers, then
yield the result of applying the Racket
-
operation to v1, ..., vn - Otherwise, yield error("Bad arguments to -")
Note first that for 0 arguments, there is a special case:
[PrimP (op args)
(case op
['- (cond
[(= 0 (length args)) ... error for 0, find it in the spec ...]
[(< 0 (length args)) (desugar-subtract args)])])]
The actual work is going to get a little longer than we want to do here, so we've made a helper for ourselves:
;; somewhere else in the file
(define (desugar-subtract (args : (listof ExprP))) : ExprC
...)
What do we want to do here? Well, the specification tells us that we need to
evaluate all the arguments to values. And, we're going to need to use those
values later. LetC
is going to be helpful here again. But, we don't know up
front how many arguments there will be, and we're going to need an identifier
for each one. We need to create new identifiers on demand:
(define (make-ids (n : number)) : (listof symbol)
(build-list n (lambda (n) (string->symbol (string-append "var-" (to-string n))))))
Now we can at least have the identifiers that we need:
(define (desugar-subtract (args : (listof ExprP))) : ExprC
(local ([define ids (make-ids (length args))])
...))
Now we need to think about what we'll want these expressions to look like. We will at least want to let-bind each argument expression, so
-(e1, e2, e3)
should desugar to something like:
(LetC 'var1 (desugar e1)
(LetC 'var2 (desugar e2)
(LetC 'var3 (desugar e3)
... check for numbers and do the subtraction ...)))
We can fill in a bit more of what we want to check and do, also. Here's a
proposal, though we don't know how to write the stuff inside and
yet:
(LetC 'var1 (desugar e1)
(LetC 'var2 (desugar e2)
(LetC 'var3 (desugar e3)
(IfC (and (Prim2C '== (Prim1C 'tagof (IdC 'var1)) (StrC "number"))
(Prim2C '== (Prim1C 'tagof (IdC 'var2)) (StrC "number"))
(Prim2C '== (Prim1C 'tagof (IdC 'var3)) (StrC "number")))
(Prim2C 'num- (Prim2C 'num- (IdC 'var1) (IdC 'var2)) (IdC 'var3))
(ErrorC "Bad arguments to -")))))
See what we did here? We're using a combination of tagof
and ==
,
primitives we have available in the core, to check for number-ness of each of
the arguments that we've evaluated. We don't have a primitive for and
, so
we'll need to get to that. We also use num-
, the binary operation from the
core, to actually do the subtraction. Note that the order of operations in
num-
is important; we need to subtract the second from the first before
subtracting the third, otherwise
-(9,2,1)
might give us 8
back instead of 6
.
If we want to put this all together for an arbitrary number of expressions, we need a few pieces:
;; cascade-lets will build up the nested lets, and use body as the
;; eventual body, preserving order of evaluation of the expressions
(define (cascade-lets (ids : (listof symbol))
(exprs : (listof ExprC))
(body : ExprC)) : ExprC
(cond [(empty? ids) body]
[(cons? ids)
(LetC (first ids) (first exprs) (cascade-lets (rest ids) (rest exprs) body))]))
;; check-type builds an expression that checks the type of the expression
;; given as an argument
(define (check-type (expr : ExprC) (type : string)) : ExprC
(Prim2C '== (Prim1C 'tagof expr) (StrC type)))
;; and builds up an and expression from its two pieces
(define (and (expr1 : ExprC) (expr2 : ExprC)) : ExprC
(IfC expr1 expr2 (FalseC)))
;; all builds up a series of ands over the expression arguments
(define (all (exprs : (listof ExprC))) : ExprC
(foldl (lambda (exp result) (and exp result)) (TrueC) exprs))
;; map-subtract builds an expression that maps 'num- over a list of expressions
(define (map-subtract (exprs : (listof ExprC))) : ExprC
(foldl (lambda (expr result) (Prim2C 'num- result expr)) (first exprs) (rest exprs)))
Now we can put all of these things together:
(define (desugar-subtract (args : (listof ExprP))) : ExprC
(local ([define ids (make-ids (length args))]
[define id-exps (map IdC ids)])
(cascade-lets ids args
(IfC (all (map (lambda (e) (check-type e "number")) id-exps))
(map-subtract id-exps)
(ErrorC "Bad arguments to -")))))
So we've taken our list of args and desugared it to type-tests and binary
operators, preserving the right order of operation, side effects, and the error
case. Subtraction is one of the most complicated cases in the desugaring; you
should use what we've shown you here to build up what you need for +
, which
needs to make a decision between strings and numbers.
Getting Started From Here
The zip bundle, along with the appropriate binary, are enough to get started running your desugarer composed with our core interpreter. Your implementation won't do much initially, because it doesn't contain even the code for desugaring booleans and strings:
$ echo "true" | debian-dist/bin/assignment3-debian --interp-with-desugar typed-desugar-template.rkt
Haven't desugared a case yet:
(TrueC)
We've added the "Haven't desugared..." message in the else
clause of the
type-case so you'll be informed when you're trying to desugar a program and
haven't yet tackled a type of expression.
We did add the number case for you, so you can see subtraction working:
$ echo "-(9,2,1)" | debian-dist/bin/assignment3-debian --interp-with-desugar typed-desugar-template.rkt
6
You should start by filling in the cases in desugar
for the constants, and
then start building up the other expressions so you can test more and more.
Test Cases
We're providing a really big test suite for you, built out of 73 files of our own tests (in brown/), and 420 tests that are made up of 4 different submissions of the first assignment. These aren't necessarily the best submissions (we've only had time to look carefully at a fraction of the hundreds that were submitted!), but they are far from the worst, and cover a broad set of features well.
We're only going to grade you on passing or failing the tests in brown/ for this assignment, as that's all the grading script checks. You should try to hit all the tests in the contrib/ directories as well, because we are going to use an expanded test suite for the next assignment, where you'll be implementing the core of ParselTongue, and it will include some of the other 400+ tests.
Remember that you can use the --test-with-desugar
option to test single
subdirectories at a time (for example, the distributed tests have some
directories that contain only object tests). Feel free to move tests around
and write new ones that are specifically targeted at features you're working on
desugaring; the grading script doesn't care at all about the test directory
you downloaded, as it has all the tests it uses included internally.
Executable
The executable for this assignment has several options:
--stdin <filename>
Takes a filename to use as the standard input (useful for Windows
folks)
--stdout <filename>
Takes a filename to redirect standard output to (useful for Windows
users)
--stderr <filename>
Takes a filename to redirect standard err to (useful for Windows
users)
NOTE: --stdin, --stderr, and --stdout must come before other
options to have their effect.
--interp
Runs stdin with the correct interpreter. This is included mainly
for reference and convenience.
--interp-with-desugar <filename>
Expects <filename> to be the name of a Racket file that contains an
implementation of desugar (as described above). Composes the
desugarer there with the correct interpreter, reads the program from
stdin, and outputs the result on stdout/err.
Note on filenames for Windows users: The script complains if you have a
leading ".\" in front of the filename for your desugarer. Do this:
$ path\to\bin\assignment3-win32 --interp-with-desugar my-desugar.rkt
Not this:
$ path\to\bin\assignment3-win32 --interp-with-desugar .\my-desugar.rkt
--brief
Briefer output for test-with-desugar and test-with normal
--test-with-desugar <filename> <path>
Expects <filename> to be the name of a Racket file that contains
an implementation of desugar (as described above). Composes an
evaluator together as with --interp-with-desugar, but runs it on
all the tests at <path>.
--test-with-normal <path>
Runs all the tests at <path> with the normal interpreter. Useful for
reference and convenience.
--report <filename>
Generates grade report on standard out (or the argument to --stdout),
by loading the implementation of desugar from <filename>, and running
it on an internal set of tests.
Handin
At the handin link, submit:
- Your text file grade report
- A single Racket file that defines
desugar
, using the template above