PROSPECTUS

Programming Parallel and Distributed Systems

Computer Science 178, Spring 2002

Steven P. Reiss

COURSE OVERVIEW

A broad set of applications today requires more computing power than is generally available on a single machine. These applications include scientific computations such as weather forecasting, business applications such as web services and databases, AI applications such as image processing, and theoretical applications such as cryptography.

These applications are generally addressed by using multiple processors to attack the same problem at one time. Multiple processors can be used in a variety of ways, and a range of techniques, algorithms, and tools has been developed to support each of them. This course will cover this material from a practical point of view, with an emphasis on learning to build working systems.

The course will be broken up into five logical sections. The first will provide an overview of the problems involved in parallel and distributed computing, the types of approaches that have been taken, the types of applications that will be considered, and historical background.

The second part will cover the use of multiple processors in a shared memory environment as is found on advanced workstations and servers today. This will involve programming with multiple threads and multiple processes. It will cover the variety of synchronization primitives that have been developed. It will include one or more assignments that require the student to write a working multiple-process program.
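As a rough illustration of the kind of shared-memory programming this part covers, the sketch below has several threads update one shared counter, with a lock serving as the synchronization primitive. It is written in Python purely for brevity; the function name run_counter and its parameters are hypothetical and are not taken from the course assignments.

```python
import threading


def run_counter(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()  # mutual exclusion around the shared update

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Without the lock, the read-modify-write below could interleave
            # with another thread's update and lose increments.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reading the result
    return counter


total = run_counter()
```

With the lock in place, the final count equals the number of threads times the number of increments per thread.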

The third part of the course will extend this to more general distributed systems. It will look at distributed architectures for client-server computing. It will consider mechanisms such as Java RMI, CORBA, and DCOM, as well as Internet-based client-server computing using JSP/ASP, servlets, JavaScript, and similar technologies. This part of the course will again require students to write and debug a program that uses the techniques covered.

The fourth part of the course will cover parallel computation. It will look at various architectural models, including networks of workstations, the IBM SP series, and the CM-5. It will consider message passing using MPI. It will discuss techniques such as load balancing, message bundling, and dynamic processor allocation. It will again include an assignment involving the construction of a program that illustrates the underlying concepts.
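To give a flavor of dynamic load balancing, the sketch below lets idle workers pull the next task on demand instead of receiving a fixed share up front. It is only an approximation of the idea: a shared in-memory queue stands in for the messages a master process would send, threads stand in for processors, and no MPI calls are used. The name balance and the toy workload are assumptions made for illustration.

```python
import queue
import threading


def balance(tasks, num_workers=3):
    """Hand out tasks on demand; return {worker_id: [(task, result), ...]}."""
    work = queue.Queue()
    for t in tasks:
        work.put(t)

    # Each worker appends only to its own list, so no extra lock is needed.
    results = {w: [] for w in range(num_workers)}

    def worker(wid):
        while True:
            try:
                n = work.get_nowait()  # ask for the next unit of work
            except queue.Empty:
                return  # nothing left: this worker is done
            # Toy workload whose cost grows with n, so task sizes are uneven.
            results[wid].append((n, sum(i * i for i in range(n))))

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


assignment = balance([50_000, 10, 20, 40_000, 5, 30_000])
```

Because tasks are claimed one at a time, a worker that draws a long task simply claims fewer tasks overall, which is the effect dynamic load balancing aims for.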

The final part of the course will cover the algorithms essential to the underlying applications and how they are implemented efficiently in the various forms of distributed and parallel computation. Here we will look at sorting and searching, matrix algorithms, solving differential equations, transaction processing, neural networks, and genetic algorithms. This portion of the course will be somewhat integrated into the previous four parts so that the various algorithms can be considered in each approach as appropriate.
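As one small example of how a familiar algorithm is restructured for parallel execution, the sketch below sorts a list by splitting it into chunks, sorting the chunks concurrently, and merging the sorted runs. It is an illustrative outline rather than a tuned implementation: the name parallel_sort and the chunk count are assumptions, and threads in standard Python will not necessarily run the chunk sorts any faster than a single sort would.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def parallel_sort(data, num_chunks=4):
    """Sort data by sorting chunks concurrently and merging the results."""
    if not data:
        return []
    size = -(-len(data) // num_chunks)  # ceiling division: items per chunk
    chunks = [data[i:i + size] for i in range(0, len(data), size)]

    # Phase 1: each chunk is sorted independently of the others.
    with ThreadPoolExecutor(max_workers=num_chunks) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))

    # Phase 2: a k-way merge combines the sorted runs into one list.
    return list(heapq.merge(*sorted_chunks))


ordered = parallel_sort([5, 3, 8, 1, 9, 2, 7])
```

The same split, local sort, and merge structure carries over to message-passing settings, where each chunk would live on a different processor.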

Students coming out of the course will have a basic understanding of the problems and solutions needed in attempting to achieve maximal performance from today's and tomorrow's computers. They should acquire the basic skills needed to program applications on a variety of parallel and distributed architectures. Moreover, they should have a good sense of why this area of Computer Science is a fruitful one for further work.

In addition to the various programming assignments, there will be a final exam and a midterm covering the non-programming aspects of the course.

LECTURES

Classes are scheduled on Tuesdays and Thursdays from 9:00-10:20 in Salomon 003. Lectures will generally involve going over and expanding material from the texts with an emphasis on how the concepts and techniques are actually used in real systems.

COLLABORATION POLICY

Students may, at their discretion, work together on the programming assignments. If they do so, the program that is turned in will be expected to be more sophisticated and complete. Moreover, all students working on an assignment are expected to be familiar with the design and implementation of the overall system. Suggested extensions for multi-person projects will be included in the homework assignments. The midterm and final are to be done individually with no collaboration.

GRADES

Grades will be determined from the six programming assignments (60%), class participation (10%), the midterm (10%), and the final (20%).

TEXTBOOKS

There are three textbooks for this course. The first, Andrews' Foundations of Multithreaded, Parallel, and Distributed Programming, will be used to supply much of the material for the distributed computing portions of the course and overview material for the parallel aspects. It will be supplemented with other readings and lectures for the Internet portions of the course. The second text, Pacheco's Parallel Programming with MPI, will be used as a basic introduction to MPI. If you prefer reading the MPI manual or some other MPI description, you can probably substitute it for this text. The third text, Wilkinson and Allen's Parallel Programming, will be used to cover some of the in-depth issues involving parallel algorithms and techniques. It is listed as optional since we will be making only limited use of the material.

ACCOUNTS

The various programming assignments will be done on the Suns. Students who do not have Sun accounts will be given them. We will also provide the appropriate course package to plug into your account.

OFFICE HOURS

The TA for the course is Ioannis Tsochantaridis (it@cs.brown.edu). TA hours will be set up as needed. Mr. Reiss does not maintain office hours, but is generally available in his office from 8 to 5 weekdays, and is available by email as spr.