Hari Narayanan
Goals
This project will recover the 3D structure of an object by analyzing a video of its movement. The basic pipeline
consists of corner detection, point tracking, and finally a structure from motion algorithm.
Algorithm
The first step of the process is to select a number of keypoints and track them through the animation. We use a
Harris corner detector to select feature points, and use the Kanade-Lucas-Tomasi algorithm to track them. To do
this, we need to compute optic flow between consecutive frames of the animation. This process makes three central
assumptions about the input:
- Brightness constancy: a point has the same brightness in each frame it appears in.
- Small motion: a point doesn't move far from its position in the previous frame.
- Spatial coherence: each point's neighbors move in a similar direction and with a similar magnitude.
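The corner selection step mentioned above can be sketched roughly as follows. This is a minimal NumPy illustration of a Harris response, not the code used in this project: the function names (`box_filter`, `harris_response`, `top_corners`), the window radius `r`, and the constant `k` are all assumptions chosen for the example, and non-maximum suppression is left out for brevity.

```python
import numpy as np

def box_filter(a, r):
    """Sum over each (2r+1)x(2r+1) window, zero-padded at the border."""
    h, w = a.shape
    p = np.pad(a, r)
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, r=2, k=0.05):
    """Harris corner response det(H) - k * trace(H)^2 at every pixel."""
    img = img.astype(float)
    Iy, Ix = np.gradient(img)          # derivatives along rows (y), then columns (x)
    Sxx = box_filter(Ix * Ix, r)       # entries of the second-moment matrix H
    Syy = box_filter(Iy * Iy, r)
    Sxy = box_filter(Ix * Iy, r)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

def top_corners(img, n=50, r=2, k=0.05):
    """Return the n highest-response pixels as rows of (x, y)."""
    R = harris_response(img, r, k)
    idx = np.argsort(R, axis=None)[::-1][:n]
    ys, xs = np.unravel_index(idx, R.shape)
    return np.stack([xs, ys], axis=1)
```

Edges give a negative response and flat regions give zero, so only pixels whose window contains gradients in two directions score highly. A real detector would also suppress neighboring responses around each peak so that one corner yields one keypoint.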
Optic flow
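Consider a point (x,y,t) that moves with displacement <u,v> between frames t and t+1. By the brightness constancy assumption,
    I(x, y, t) = I(x+u, y+v, t+1)
However, we can't solve this equation as is, so we make a linear approximation (a first-order Taylor expansion, valid when the motion between frames is small):
    I_x u + I_y v + I_t = 0
where I_x, I_y, and I_t are the partial derivatives of the image brightness I with respect to x, y, and t.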
As we have two unknowns and only one equation, we use spatial coherence and assume that each point in a 15x15
neighborhood of (x,y) also moves with displacement <u,v>, giving us 225 equations and two
unknowns. This system is overdetermined, so we use least squares to find an approximate solution for u and v.
This is the optic flow of the animation at point (x,y,t).
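As an illustration of the least squares step, here is a minimal NumPy sketch. It assumes grayscale frames stored as 2D arrays, an integer pixel (x, y) far enough from the border, and central-difference gradients; the function name `lucas_kanade_flow` and the parameter `half` (7, giving a 15x15 window and 225 equations) are choices made for this example, not taken from the project code.

```python
import numpy as np

def lucas_kanade_flow(frame0, frame1, x, y, half=7):
    """Estimate the flow (u, v) at integer pixel (x, y).

    Each pixel in the (2*half+1)^2 window contributes one equation
    I_x*u + I_y*v = -I_t, and the stacked system is solved by least squares.
    """
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    Iy, Ix = np.gradient(f0)           # spatial derivatives (rows = y, columns = x)
    It = f1 - f0                       # temporal derivative between the two frames
    win = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)   # one row per pixel
    b = -It[win].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

The estimate is only reliable where the window contains gradients in more than one direction, which is one reason to track corners rather than arbitrary pixels.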
Point tracking
Treating optic flow as a vector field, we can use Newton's method to figure out where each interest point is in
the next frame. Since a point's coordinates are generally not integers once it has been displaced, we need to use
interpolation to sample values at sub-pixel positions. Finally, if a point moves out of the boundary of the image,
we remove that point from the tracker entirely.
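The interpolation and pruning steps can be sketched as below. This is a simplified illustration under stated assumptions: it takes a single displacement step from a precomputed dense flow field rather than iterating, it uses bilinear interpolation (the text does not say which kind was used), and the names `bilinear`, `advance_points`, `flow_u`, and `flow_v` are hypothetical.

```python
import numpy as np

def bilinear(field, x, y):
    """Sample a 2D array at real-valued (x, y); x indexes columns, y indexes rows."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    ax, ay = x - x0, y - y0
    return ((1 - ax) * (1 - ay) * field[y0, x0]
            + ax * (1 - ay) * field[y0, x0 + 1]
            + (1 - ax) * ay * field[y0 + 1, x0]
            + ax * ay * field[y0 + 1, x0 + 1])

def advance_points(points, flow_u, flow_v):
    """Move each (x, y) by the interpolated flow and drop points that leave the image.

    Returns the surviving points and a per-input flag saying whether it survived.
    Input points must satisfy 0 <= x < w-1 and 0 <= y < h-1.
    """
    h, w = flow_u.shape
    kept, alive = [], []
    for x, y in points:
        nx = x + bilinear(flow_u, x, y)
        ny = y + bilinear(flow_v, x, y)
        inside = 0 <= nx < w - 1 and 0 <= ny < h - 1
        alive.append(bool(inside))
        if inside:
            kept.append((float(nx), float(ny)))
    return kept, alive
```

The `alive` flags make it possible to delete the full track of any point that went off-camera, rather than only its last position.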
Structure from motion
This section of the project follows the method described in Morita and Kanade, 1997. We build a 2F x P measurement matrix, where F is the number of frames and P is the number of tracked points, and decompose it into the product of a rotation matrix M and a structure matrix S.
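To make the decomposition concrete, the sketch below factors a measurement matrix with a batch SVD. This is an assumption made for illustration: it is the standard rank-3 batch factorization, not the sequential procedure of the cited paper, and it stops before the metric constraints are applied, so M and S are recovered only up to an invertible 3x3 transformation. The function name `factorize` is hypothetical, and any consistent ordering of the x and y rows of W works.

```python
import numpy as np

def factorize(W):
    """Rank-3 factorization of a 2F x P measurement matrix W.

    Returns M (2F x 3), S (3 x P), and the per-row centroid, such that
    M @ S + centroid approximates W.
    """
    centroid = W.mean(axis=1, keepdims=True)   # per-row mean removes image translation
    Wc = W - centroid
    U, s, Vt = np.linalg.svd(Wc, full_matrices=False)
    root = np.sqrt(s[:3])                      # split the top 3 singular values evenly
    M = U[:, :3] * root                        # motion: one pair of rows per frame
    S = root[:, None] * Vt[:3]                 # structure: one 3D column per point
    return M, S, centroid
```

With noise-free orthographic data the centered matrix has rank exactly 3, so the product reproduces W. With real tracks the discarded singular values absorb tracking noise.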
Notes
The corner detector marked a number of points outside of the hotel structure as interest points, so I pruned these points out before tracking.
Results
Plot of 20 random points and their tracking paths:
Plot of the points that go off-camera and their tracking paths:
Three different views of the 3D structure of the object:
X, Y, and Z plots of camera movement over time: