You must complete this program with the same partner you had for "Written: Continuations". This will be your team for the next few assignments.
Garbage Collection
In this assignment, you will implement two garbage collectors: mark & sweep, and stop & copy. As we have seen, garbage collection involves starting from a root set of references, which reside in local variables and stack frames, and searching for reachable values on the heap.
Writing your Garbage Collectors
Your garbage collectors must be written in the GC Collector Scheme language level. This langauge defines an interface to a program's stack and heap that you will use to implement garbage collection.
Your garbage collectors must implement the following functions:
init-allocator :: -> void
The mutator implicitly calls this function after initializing the heap and before calling any allocation routines. It is essentially a callback into the allocator indicating that the heap is ready.gc:cons :: first-addr rest-addr -> cons-addr
returns the address of a new cons cell with the given field addresses; the field addresses are presumed to be already allocatedgc:cons? :: addr -> boolean
returns a boolean that indicates whether the given address refers to a cons cellgc:first :: cons-addr -> first-addr
andgc:rest :: cons-addr -> rest-addr
return the fields of a given consgc:set-first! :: cons-addr first-addr -> void
andgc:set-rest! :: cons-addr rest-addr -> void
set the addresses of the first and last pointers of a cons cellgc:alloc-flat :: atomic -> number
allocates space for a flat value and returns the base address of the allocated blockgc:flat? :: addr -> boolean
returns a boolean that indicates whether the given address refers to an atomic valuegc:deref :: addr -> atomic
returns the atomic value stored at the given address
To help you write these functions, the GC Collector Scheme language defines an interface for the heap and the roots:
heap-size :: -> number
the size of the heapheap-offset :: -> addr
the base address of the heap where allocation should beginheap-set! :: addr atomic -> void
stores a value at the specified addressheap-ref :: addr -> atomic
returns the value at the specified addressget-root-set :: -> listof root
returns the roots of the collectionread-root :: root -> addr
returns the address the root referencesset-root! :: root addr -> void
updates the root to reference the specified address
Note that get-root-set
accepts an arbitrary number of
identifiers that it treats as roots and returns as part of the root set. That
is, read-root
returns the cell address stored in the location
referenced by the identifiers and set-root!
mutates the identifier
to reference a new cell address. You may find this useful when you implement
gc:cons
; you may wish to add the first
and
last
arguments to your root set in lieu of manually updating them if they move during garbage collection.
Testing your Garbage Collectors
You may write programs that exercise your garbage collectors using the GC Mutator Scheme language. This language is a subset of Scheme that uses a garbage collector that you specify. The first line of a test program must be:
(allocator-setup "collector.ss" heap-size heap-offset)
"collector.ss"
must be the name of your collector's file. heap-size
is the size of the heap your collector will use. heap-offset
is the base address on the heap where allocation should start (usually 0
).
The remainder of the program is in a subset of Scheme with numbers, symbols, lists, etc. The primitives of the language map directly to the procedures you define in your garbage collector.
Sample Code
To get you started, we've provided a sample mutator and a trivial collector that signals an error when the heap fills up. (In fact, our collector does signal an error with the mutator we've provided, if you don't increase the size of the heap.)
Notes
You must store bookkeeping data on the heap provided by GC Collector Scheme. You may store 2-3 atomic values, such as addresses into the heap as variables in your garbage collector. We will assume they represent machine registers. However, all compound data structures must be on the heap.
In particular, in the mark & sweep collector, you should maintain a free list in the heap. That is, you should not use any auxiliary data structure; instead, use the available space in the heap to keep track of the free list. You may use one extra box (“register”) to point to the start of the free list. You may need to adjust your allocation accordingly to have enough space to maintain free list pointers!
Some final words of advice:
- You will want to test your program with small heap sizes, so that the garbage collector runs frequently.
- You do not need to compact memory in mark-and-sweep.
- You may find it convenient to use the Scheme construct set!, which allows you to mutate a variable without using boxes. We recommend you use this feature only in one instance: when swapping the semispaces in the stop & copy collector. This will save you the trouble of unboxing the heap every time you use it.
Handin
A single member of your team should handin the assignment. Please include a README file that lists your team's members. From the directory containing the files for the assignment you wish to hand in, execute
/course/cs173/bin/cs173handin gc