Reading Comments

On the course calendar, one reading per lecture is marked with a bold (C) symbol. For each of these papers, students are expected to submit brief comments about its strengths, its weaknesses, and the questions it raises. Don't summarize the whole paper; focus on issues you find most interesting. Your plain text review should have the following format:

The Good: 1-2 sentences. What is the most exciting or interesting model, idea, or technique described here? Why is it important? Don't just copy the abstract - what do you think?
The Bad: 1-2 sentences. What is the biggest weakness of this method, model, or algorithm? Problems may include weak empirical validation, missing theory, unacknowledged assumptions, applicability to a narrow range of problems, or ...
The Ugly: 1-2 sentences. What didn't you fully understand? What would you like to see explained or discussed in class? Feel free to highlight unclear sections, steps you didn't follow, assumed background knowledge you don't have, or ...

All comments should be posted to the course's Google discussion group, brown.course.csci.2950p.2011-fall.s01, by 8:00am on the day that paper is presented. Late comments will not be given credit, but students can skip comments for four readings over the course of the semester without penalty. When posting your comments, please reply to the thread created by the instructor for that particular reading. Comments are moderated, and will be posted to other class members after the submission deadline. Registered students are automatically members of this private group.

Reading Presentations

For each class, the discussion will be divided into three 25-minute segments. Each segment will focus on either a single conference paper, or part of a longer paper. Students should expect to give an overview presentation, and lead discussion, for two of these segments. Prof. Sudderth will lecture for the remainder of the class meeting time.

Do not try to quickly go through every detail - focus on describing the key concepts clearly. After class, each presenter should send the instructor the slides or notes used in their presentation. These will be posted on the course webpage.

You are of course welcome to reuse figures and derivations from the readings in your presentations. For some papers, the authors may have useful talk slides posted online. You can make use of these, but you are expected to provide your own perspective on the material, and should not just present an "old" talk unchanged. Any external sources used in your presentation must also be credited with an explicit citation.

Final Projects

The final project counts for 70% of the overall grade. These 70 percentage points are divided as follows: 10 for a 1-3 page project proposal, due on November 7; 20 for a short oral presentation, given on December 9; and 40 for a technical report describing the results, due on December 18.

Projects which apply Bayesian nonparametric methods to the student's own research interests are particularly encouraged. Please feel free to discuss potential project ideas with the instructor. Some possible styles of project include:

Identify a BNP model suitable for a new application area, and explore baseline learning algorithms
Propose, develop, and experimentally test a new type of learning algorithm for some existing BNP model
Experimentally compare different models or algorithms on an interesting, novel dataset
Survey the latest advances in an area of BNP theory or application which is not covered by the course, and for which no such survey currently exists

Project Proposals

The project proposal should be at most 3 pages long, including all figures and references. We encourage, but do not require, you to use the NIPS LaTeX style file. Proposals must be submitted as a single PDF file, by e-mail to the instructor, before 11:59pm on Monday, November 7. Your proposal should contain the following information:

A clear description of the problem or application you intend to address. Why is it worth studying?
A discussion of related work, including references to at least three relevant research articles. Which aspects of your project are novel?
Except for literature surveys, an experimental evaluation protocol. How will you know that you've succeeded?
A concrete plan for accomplishing your project by the end of the course. What are the biggest challenges?
A figure illustrating a Bayesian nonparametric model which plays a role in your project. We recommend creating such figures in a vector drawing program, such as Adobe Illustrator, Inkscape, or Xfig.

Project Reports

The technical report should be between 8 and 12 pages long, in the style of top machine learning conferences. Although the results need not be sufficiently novel for publication, the presentation and experimental protocols should be of high quality. We encourage, but do not require, you to use the NIPS LaTeX style file. Reports must be submitted as a single PDF file, by e-mail to the instructor, before 11:59pm on Sunday, December 18. Your report should include:

A clear description of the problem addressed, and summary of related work with appropriate references.
A mathematically precise description of the statistical models and learning algorithms that you consider. For the parts of your project which are novel contributions, include derivations which are sufficiently detailed for knowledgeable experts to reproduce your work.
To help verify that your statistical learning algorithm is working properly, at least one plot showing the learning objective (joint log-probability for an MCMC method, a log-likelihood bound for a variational method, etc.) as a function of the number of learning iterations.
Some sort of visualization of the learned model structure; summary performance numbers are not sufficient. For example, for many BNP models it is possible to plot the learned clusters or features, sample from the posterior or predictive distributions, visualize results on low-dimensional toy data, etc.
A description of implementation details, including references for any code that was adapted and reused, a high-level summary of the functionality that your code implements, the programming language(s) you used, etc.
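
As an illustration of the learning-objective plot requested above, here is a minimal Python sketch. It is not a required template. It uses EM on a toy two-component, unit-variance Gaussian mixture purely as a stand-in for your own algorithm, and the names make_toy_data, run_em, and plot_trace are hypothetical. The only point is the pattern: record the objective once per iteration, then plot it against the iteration count. For exact EM the log-likelihood should never decrease, so a drop in the curve would suggest a bug.

```python
import math
import random


def log_normal_pdf(x, mean):
    """Log density of a unit-variance Gaussian evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (x - mean) ** 2


def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def make_toy_data(seed=0, n_per_cluster=100):
    """Toy 1-D data from two well-separated Gaussians (an assumption for this demo)."""
    rng = random.Random(seed)
    left = [rng.gauss(-2.0, 1.0) for _ in range(n_per_cluster)]
    right = [rng.gauss(3.0, 1.0) for _ in range(n_per_cluster)]
    return left + right


def run_em(data, num_iters=30):
    """Run EM and return the log-likelihood recorded at the start of each iteration."""
    means = [min(data), max(data)]  # crude initialization
    weight = 0.5                    # mixing weight of component 0
    trace = []
    for _ in range(num_iters):
        # E-step: responsibilities of component 0, plus the current log-likelihood.
        resp = []
        loglik = 0.0
        for x in data:
            a = math.log(weight) + log_normal_pdf(x, means[0])
            b = math.log(1.0 - weight) + log_normal_pdf(x, means[1])
            norm = log_sum_exp(a, b)
            loglik += norm
            resp.append(math.exp(a - norm))
        trace.append(loglik)
        # M-step: closed-form updates for the means and the mixing weight.
        r0 = sum(resp)
        r1 = len(data) - r0
        means[0] = sum(r * x for r, x in zip(resp, data)) / r0
        means[1] = sum((1.0 - r) * x for r, x in zip(resp, data)) / r1
        weight = r0 / len(data)
    return trace


def plot_trace(trace, path="objective_trace.png"):
    """Save a plot of the objective versus iteration (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    plt.figure()
    plt.plot(range(1, len(trace) + 1), trace, marker="o")
    plt.xlabel("Learning iteration")
    plt.ylabel("Log-likelihood")
    plt.savefig(path)
    plt.close()
```

Calling plot_trace(run_em(make_toy_data())) writes the figure to objective_trace.png. For an MCMC method the same pattern applies with the joint log-probability in place of the log-likelihood, although that trace will fluctuate rather than increase monotonically.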