Project 1: Color Alignment

Automatic color aligning and compositing of the Prokudin-Gorskii photo collection

Due Date: 11:59pm on Friday, February 5th, 2010

Brief

This handout: /course/cs195g/asgn/proj1/handout/
Stencil code: /course/cs195g/asgn/proj1/stencil/
Data: /course/cs195g/asgn/proj1/data/
Search through more data: Library of Congress
Browse through more data: Library of Congress
Handin: cs195g_handin proj1
Required files: README, code/, html/, html/index.html

Background

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) was a photographer ahead of his time. He saw color photography as the wave of the future and came up with a simple idea for producing color photos: record three exposures of every scene onto a glass plate using a red, a green, and a blue filter, then project the monochrome pictures with correctly colored light to reproduce the color image. Color printing of photos was very difficult at the time. Thanks to the fame his color photos brought him, including the only color portrait of Leo Tolstoy (a famous Russian author), he won the Tzar's permission and funding to travel across the Russian Empire and document it in 'color' photographs. His RGB glass plate negatives were purchased in 1948 by the Library of Congress. They have recently been digitized and made available online.

Requirements

Take the digitized Prokudin-Gorskii glass plate images and automatically produce a color image with as few visual artifacts as possible. You will need to extract the three color channel images, place them on top of each other, and align them so that they form a single RGB color image. The high-resolution images are quite large, so you will need a fast and efficient alignment algorithm (read: image pyramid). You are required to implement both a single-scale and a multi-scale alignment algorithm, each searching over a user-specified window of displacements. You are also required to try your algorithm on other images from the Prokudin-Gorskii collection.

Details

Important notes about the images:

The images are, from top to bottom, in BGR order.
Each plate is available online in both a high-resolution and a low-resolution version, so consider trying your alignment algorithm on both.
Assume the negatives are evenly divided into 3 plates (i.e., each plate occupies exactly 1/3 of the negative).
Assume that a simple x,y translation model is sufficient for proper alignment.
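The notes above imply a very simple split of the scan. Here is a minimal sketch in Python rather than MATLAB, using a nested list as a stand-in for an image so it runs without any toolbox. The function name split_plate is hypothetical, and dropping leftover rows when the height is not divisible by 3 is an assumption of this sketch, not something the handout specifies.

```python
def split_plate(plate):
    """Split a stacked plate (list of rows) into B, G, R parts, top to bottom."""
    h = len(plate) // 3            # height of each part; leftover rows are dropped (assumption)
    blue = plate[0:h]              # top third
    green = plate[h:2 * h]         # middle third
    red = plate[2 * h:3 * h]       # bottom third
    return blue, green, red


# Toy "scan": 7 rows of 4 pixels, each row filled with its own row index.
plate = [[r] * 4 for r in range(7)]
blue, green, red = split_plate(plate)
print(len(blue), len(green), len(red))  # 2 2 2
```

In MATLAB the same split is a matter of indexing row ranges of the image matrix.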

MATLAB stencil code is available in /course/cs195g/asgn/proj1/stencil/. You're free to do this project in whatever language you want, but the TAs are only offering support in MATLAB.

Some of the digitized glass plate images (both hi-res and low-res versions) are in /course/cs195g/asgn/proj1/data/.

Your program will take a glass plate image as input and produce a single color image as output. The program should divide the image into three equal parts and align the second and third parts (G and R) to the first (B). For each image, record the displacement vector used to align each part, and state whether it is (y,x) or (x,y): MATLAB indexes matrices as (row, column), so the first coordinate is typically the vertical component.
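To make the (y,x) convention concrete, here is a small Python sketch of the final compositing step. It assumes the displacements are already known and are given as (dy, dx), that is, vertical component first. The helper names circshift and compose_rgb are illustrative only; circshift mimics the wrap-around behaviour of MATLAB's circshift(A, [dy dx]) on a nested list.

```python
def circshift(img, dy, dx):
    """Circularly shift rows down by dy and columns right by dx."""
    h, w = len(img), len(img[0])
    return [[img[(r - dy) % h][(c - dx) % w] for c in range(w)] for r in range(h)]


def compose_rgb(blue, green, red, g_shift, r_shift):
    """Shift G and R by their (dy, dx) displacements and stack (R, G, B) pixels."""
    g = circshift(green, *g_shift)
    r = circshift(red, *r_shift)
    h, w = len(blue), len(blue[0])
    return [[(r[y][x], g[y][x], blue[y][x]) for x in range(w)] for y in range(h)]


# One bright pixel per channel; G sits one row too high, R one column too far right.
blue = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
green = [[0, 9, 0], [0, 0, 0], [0, 0, 0]]
red = [[0, 0, 0], [0, 0, 9], [0, 0, 0]]

g_shift, r_shift = (1, 0), (0, -1)      # (dy, dx): vertical component first
rgb = compose_rgb(blue, green, red, g_shift, r_shift)
print("G shift (y,x):", g_shift, " R shift (y,x):", r_shift)
print(rgb[1][1])  # (9, 9, 9)
```

Printing the convention next to the numbers, as in the last lines, is one way to keep the report unambiguous.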

The easiest way to align the parts is to exhaustively search over a window of possible displacements (e.g. [-15,15] pixels), score each one using some image matching metric, and take the displacement with the best score. There are several possible metrics to measure how well images match:

Sum of squared differences (lower is better): sum( (image1(:)-image2(:)).^2 )
Normalized cross correlation (higher is better): dot( image1(:)/norm(image1(:)), image2(:)/norm(image2(:)) )
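As a rough illustration, the two metrics above can be written in a few lines of Python over nested lists. This is a sketch, not the stencil code: the function names ssd and ncc are chosen here, the images are assumed to have the same size, and ncc assumes neither image is all zeros (otherwise the norm is zero and the division fails).

```python
import math


def ssd(img1, img2):
    """Sum of squared differences over all pixels; lower means a better match."""
    return sum((a - b) ** 2 for r1, r2 in zip(img1, img2) for a, b in zip(r1, r2))


def ncc(img1, img2):
    """Dot product of the two images after scaling each to unit norm; higher is better."""
    v1 = [a for row in img1 for a in row]
    v2 = [b for row in img2 for b in row]
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    return sum(a * b for a, b in zip(v1, v2)) / (n1 * n2)


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[2.0, 4.0], [6.0, 8.0]]   # same pattern as a, but twice as bright
print(ssd(a, b))               # 30.0
print(ncc(a, b))               # 1.0 up to floating-point rounding
```

In this toy case the two images differ only by a brightness factor, so SSD reports a mismatch while the normalized metric does not.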

Note that in this particular case, the images to be matched do not actually have the same brightness values (they are different color channels), so other metrics might work better.
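The exhaustive search described above might look like the following Python sketch. It uses SSD with a wrap-around shift on nested lists; align_exhaustive, circshift, and ssd are names picked for this example, and a real solution would likely score only interior pixels and might prefer another metric.

```python
def circshift(img, dy, dx):
    """Circularly shift rows down by dy and columns right by dx."""
    h, w = len(img), len(img[0])
    return [[img[(r - dy) % h][(c - dx) % w] for c in range(w)] for r in range(h)]


def ssd(img1, img2):
    """Sum of squared differences; lower means a better match."""
    return sum((a - b) ** 2 for r1, r2 in zip(img1, img2) for a, b in zip(r1, r2))


def align_exhaustive(channel, reference, window=15):
    """Try every (dy, dx) in [-window, window]^2 and keep the lowest-SSD shift."""
    best_score, best_shift = None, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = ssd(circshift(channel, dy, dx), reference)
            if best_score is None or score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift


# Toy test: a 6x6 image with distinct pixel values, misaligned by a known amount.
reference = [[r * 6 + c for c in range(6)] for r in range(6)]
channel = circshift(reference, -2, 1)          # moved up 2 rows, right 1 column
print(align_exhaustive(channel, reference, window=2))  # (2, -1), as (dy, dx)
```

The cost grows with the square of the window size times the number of pixels, which is why this approach alone does not scale to the high-resolution scans.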

Exhaustive search will become prohibitively expensive if the pixel displacement is too large (which will be the case for high-resolution glass plate scans). In this case, you will need to implement a faster search procedure such as an image pyramid. An image pyramid represents the image at multiple scales (usually scaled by a factor of 2) and the processing is done sequentially starting from the coarsest scale (smallest image) and going down the pyramid, updating your estimate as you go. If you come up with another approach that's as fast or faster than an image pyramid, feel free to do that and make a note of it in your README.
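One possible shape for the coarse-to-fine procedure is sketched below in Python. Everything here is an assumption made for illustration: the pyramid is built by averaging 2x2 blocks, recursion stops once the smaller image side is at most min_size pixels, each level searches only within radius pixels of the doubled coarser estimate, and SSD with a wrap-around shift is the metric.

```python
def circshift(img, dy, dx):
    """Circularly shift rows down by dy and columns right by dx."""
    h, w = len(img), len(img[0])
    return [[img[(r - dy) % h][(c - dx) % w] for c in range(w)] for r in range(h)]


def ssd(img1, img2):
    """Sum of squared differences; lower means a better match."""
    return sum((a - b) ** 2 for r1, r2 in zip(img1, img2) for a, b in zip(r1, r2))


def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (an odd last row/column is dropped)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]


def search(channel, reference, center, radius):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    cy, cx = center
    best = None
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = ssd(circshift(channel, dy, dx), reference)
            if best is None or score < best[0]:
                best = (score, (dy, dx))
    return best[1]


def align_pyramid(channel, reference, min_size=8, radius=2):
    """Estimate the shift at half resolution, double it, then refine at this level."""
    if min(len(channel), len(channel[0])) <= min_size:
        return search(channel, reference, (0, 0), radius)
    cy, cx = align_pyramid(downsample(channel), downsample(reference), min_size, radius)
    return search(channel, reference, (2 * cy, 2 * cx), radius)


# Toy test: a 32x32 ramp image shifted by (-8, 4); three levels (8, 16, 32 pixels).
reference = [[r * 32 + c for c in range(32)] for r in range(32)]
channel = circshift(reference, -8, 4)
print(align_pyramid(channel, reference))  # (8, -4), as (dy, dx)
```

Each level only tests a 5x5 window, yet the recovered shift is 8 pixels; a single-scale search would need a window of at least [-8,8]. The test shift is a multiple of 4 so that every level sees a whole-pixel shift. Real scans will only match approximately at the coarse levels, which is why each finer level refines the estimate.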

Write up

For this project, and all other projects, you must write a project report in HTML. In the report, describe your algorithm and the design decisions behind it. Then show and discuss the results of your algorithm. Also discuss any extra credit you did. Feel free to add any other information you feel is relevant.

Extra Credit

Although the color images resulting from this automatic procedure will often look strikingly real, they are still not nearly as good as the manually restored versions available on the LoC website and from other professional photographers. However, each photograph takes days of painstaking Photoshop work, adjusting the color levels, removing the blemishes, adding contrast, etc. Can you come up with ways to address these problems automatically? Feel free to come up with your own approaches or talk to the Professor or TAs about your ideas. There is no right answer here, just try out things and see what works.

Some simple ideas:

Automatic cropping (remove whitespace, black or other color borders)
Automatic white balance
Automatic contrast adjustment
Take your own photos and try your algorithm on them
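As a starting point for the contrast idea in the list above, here is a deliberately naive Python sketch that linearly stretches one channel so its darkest pixel maps to 0 and its brightest to 1. The function name stretch_contrast and the choice of a plain min/max stretch are assumptions of this example; there are many other ways to adjust contrast, and this one is sensitive to outlier pixels such as plate borders.

```python
def stretch_contrast(channel):
    """Linearly rescale values so the minimum maps to 0.0 and the maximum to 1.0."""
    flat = [v for row in channel for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                       # flat image: nothing to stretch
        return [row[:] for row in channel]
    return [[(v - lo) / (hi - lo) for v in row] for row in channel]


dull = [[0.2, 0.4], [0.6, 0.8]]        # values occupy only part of [0, 1]
print(stretch_contrast(dull))
```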

Graduate Credit

To get graduate credit on THIS project you need to do at least 3 forms of extra credit. This will not be the case for all projects.

Web-Publishing Results

All the results for each project will be put on the course website so that the students can see each other's results. In class we will have presentations of the projects and the students will vote on who got the best results. If you do not want your results published to the web, you can choose to opt out. If you want to opt out, email cs195gtas[at]cs.brown.edu saying so.

Handing in

This is very important: you will lose points if you do not follow these instructions. Every time after the first that you do not follow them, you will lose 5 points. The folder you hand in must contain the following:

README - text file containing anything about the project that you want to tell the TAs
code/ - directory containing all your code for this assignment
html/ - directory containing your entire HTML report for this assignment (including images)
html/index.html - home page for your results

Then run: cs195g_handin proj1
If it is not in your path, you can run it directly: /course/cs195g/bin/cs195g_handin proj1

Rubric

+55 pts: Single-scale implementation
+35 pts: Multi-scale implementation
+10 pts: Write up
+20 pts: Extra credit (up to twenty points)
-5*n pts: Lose 5 points for every time (after the first) you do not follow the instructions for the hand in format

Final Advice

Many of the suggested MATLAB functions are in the Image Processing Toolbox. If you plan to work anywhere other than the Sun Lab machines, you will need the Toolbox.
Don't get bogged down tweaking input parameters. Most, but not all, images will line up using the same parameters.
The input images can be in jpg (uint8) or tiff (uint16) format. Remember to convert all formats to the same scale (see im2double and im2uint8).
Shifting a matrix is easy to do in MATLAB by using circshift.
You can create the coordinates of the window you are shifting over by using meshgrid and turn that into a list of (x,y) pairs.
The borders of the images will probably hurt your results. Try computing your metric on the interior pixels only.
Output all of your images as jpg. It will save you a lot of disk space.
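Three of the tips above (a common value scale, the list of window coordinates, and interior-only scoring) are sketched below in Python for readers not using MATLAB. The helper names are made up for this example. to_unit_range divides by the largest value of the given bit depth, which is what im2double does for uint8 and uint16 input; window_shifts plays the role of the meshgrid-based coordinate list.

```python
from itertools import product


def to_unit_range(img, bits):
    """Map integer pixel values of the given bit depth to floats in [0, 1]."""
    scale = float(2 ** bits - 1)       # 255 for 8-bit, 65535 for 16-bit
    return [[v / scale for v in row] for row in img]


def window_shifts(radius):
    """List every (dy, dx) displacement in [-radius, radius] x [-radius, radius]."""
    r = range(-radius, radius + 1)
    return list(product(r, r))


def interior(img, margin):
    """Drop `margin` pixels from every side so borders do not affect the metric."""
    return [row[margin:len(row) - margin] for row in img[margin:len(img) - margin]]


print(to_unit_range([[0, 255]], bits=8))      # [[0.0, 1.0]]
print(len(window_shifts(15)))                 # 961 candidate displacements
print(interior([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16]], margin=1))  # [[6, 7], [10, 11]]
```

How large a margin to discard is a judgment call that depends on how wide the plate borders are in a given scan.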

Credits

Project derived from Alexei A. Efros' Computational Photography course, with permission.