Optical Flow and Structure From Motion
John Oberlin
12/12/2011



Overview


For this project, I used the optical flow of a sequence of images to track key points found in the first frame.

The key points were found using a Harris corner detector; I kept the strongest 200 points.
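
Roughly, this selection step looks like the sketch below; corner() from the Image Processing Toolbox stands in for the detector, and img1 is assumed to be the first frame as a grayscale double image.

    % Select the 200 strongest Harris corners in the first frame.
    pts = corner(img1, 'Harris', 200);
    px  = pts(:, 1);                  % x (column) coordinates of the key points
    py  = pts(:, 2);                  % y (row) coordinates of the key points
    imshow(img1); hold on;
    plot(px, py, 'r+');               % key points overlaid on the first frame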

Next, for each frame, the optical flow at the current location of each key point was used to predict that point's location in the next frame.

Once the predicted 2D paths were determined, a structure from motion algorithm was used to predict the 3D positions of the key points.

Keypoint Selection


Writeup Requirement 1: An image of the selected keypoints overlaid on the first frame of the sequence.



Feature Tracking


In my optical flow implementation, I used imfilter for the filtering and ones([15 15]) as the box filter for the windowed sums.
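
For concreteness, here is a sketch (not the exact code) of the windowed sums and the per-pixel least-squares solve, assuming the derivative images Ix, Iy, and It (discussed below) have already been computed.

    % Windowed sums of the derivative products over a 15x15 box.
    w   = ones([15 15]);
    Sxx = imfilter(Ix.*Ix, w);
    Sxy = imfilter(Ix.*Iy, w);
    Syy = imfilter(Iy.*Iy, w);
    Sxt = imfilter(Ix.*It, w);
    Syt = imfilter(Iy.*It, w);

    % Closed-form solution of [Sxx Sxy; Sxy Syy] * [u; v] = -[Sxt; Syt]
    % at every pixel.
    detA = Sxx.*Syy - Sxy.^2;
    detA(detA == 0) = eps;            % guard against singular windows
    u = (-Syy.*Sxt + Sxy.*Syt) ./ detA;
    v = ( Sxy.*Sxt - Sxx.*Syt) ./ detA;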

I initially used correlation (imfilter with the 'corr' option) with the filters [-1 0 1] and its transpose to find the x and y derivatives. This gave poor results, though, so I switched to using gradient(img1) for this component.

For comparison, I provide two versions of each image: the first is obtained using correlation with [-1 0 1] and its transpose (labeled 'Simple Gradient'), and the second is obtained using the Matlab gradient() function (labeled 'Built In').
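
The two derivative variants being compared are roughly the following; the temporal derivative is taken here as a plain frame difference, which is an assumption of the sketch rather than something stated above.

    % 'Simple Gradient': correlation with the 3-tap filters.
    Ix_simple = imfilter(img1, [-1 0 1],  'corr');
    Iy_simple = imfilter(img1, [-1 0 1]', 'corr');

    % 'Built In': Matlab's gradient(), which uses central differences.
    [Ix_builtin, Iy_builtin] = gradient(img1);

    % Temporal derivative, the same in both variants.
    It = img2 - img1;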

Note that when tracking key points, it was necessary to keep their positions at sub-pixel precision and to use interp2 to sample the optical flow values between pixels. This prevented unnecessary drift.
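
A sketch of the per-frame update, assuming u and v hold the flow field for the current frame and px, py are the current (fractional) key point coordinates:

    % Sample the flow at the sub-pixel point locations (bilinear by default)
    % and advance the points; the coordinates stay fractional, no rounding.
    du = interp2(u, px, py);
    dv = interp2(v, px, py);
    px = px + du;
    py = py + dv;
    % Queries outside the image return NaN, which is one way to flag
    % points that have left the frame.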

Optical Flow of First Frame, Simple Gradient:




Optical Flow of First Frame, Built In Gradient:


Note that the optical flow using the built-in gradient appears less noisy.




A visualization of the tracked points, built-in gradient:





Writeup Requirement 1: The 2D path of 20 random key points over the sequence of frames.
Simple:




Built In:






Writeup Requirement 2: An image of the key points which left the image, overlaid on the first frame of the sequence.
Simple:




Built In:



Structure From Motion


After the predicted 2D motion of the key points was found, it was time to use the structure from motion algorithm to obtain the 3D positions and motions of the key points.

My implementation is a straightforward application of the principles outlined in Tomasi and Kanade (1992) and Morita and Kanade (1997). The latter was somewhat more helpful :).

As for implementation choices, I used the built-in Matlab svd() and chol() functions for the singular value and Cholesky decompositions, respectively.
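
Concretely, the method factors the registered measurement matrix W (2F x P: all mean-subtracted image coordinates for F frames and P points) into rank-3 motion and shape matrices, then upgrades the affine result to a metric one. The sketch below follows that pipeline; solve_metric_constraints is a hypothetical stand-in for the linear system, given in the papers, that recovers L = Q*Q' from the orthonormality constraints on the camera rows.

    % Rank-3 factorization of the registered measurement matrix W (2F x P).
    [U, S, V] = svd(W, 'econ');
    Mhat = U(:, 1:3) * sqrt(S(1:3, 1:3));   % affine motion (2F x 3)
    Shat = sqrt(S(1:3, 1:3)) * V(:, 1:3)';  % affine shape  (3 x P)

    % Metric upgrade: recover L = Q*Q' from the orthonormality constraints
    % on the rows of Mhat, then factor it with a Cholesky decomposition.
    L = solve_metric_constraints(Mhat);     % hypothetical helper, not shown
    Q = chol(L, 'lower');                   % fails if L is not positive definite
    M = Mhat * Q;                           % metric motion (camera) matrix
    S3D = Q \ Shat;                         % metric 3D key point positions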

Writeup Requirement 1: A plot of the predicted 3D locations of the tracked points from 3 different view points.
Simple:






Built In:




The mesh obtained using the built-in gradient function is substantially more convincing.




Writeup Requirement 2: A plot of the predicted 3D path of the camera from 3 different view points.
Simple:






Built In:




Again, the path from the built-in gradient function looks more correct.




Triangle Mesh and Camera Direction Vectors:
Simple:




Built In:


Note how the 3D structure is not apparent in the mesh corresponding to the simple gradient.

Conclusion


Using the built-in gradient function definitely gave better results. The movie that displayed the tracked points overlaid on each frame of the sequence appeared very sharp, and the points moved quite convincingly. This is in stark contrast to the movie obtained with the simple gradient calculation, which appeared sluggish and incorrect.

Additionally, the Cholesky decomposition failed when applied to the result obtained with the simple gradient, but succeeded when applied to the result obtained with the built-in gradient.

Overall, the results were surprisingly good. In particular, one could believe that the predicted motion of the points was generated by the predicted motion of the camera.


