Oct 10, 2011
Sungmin Lee
In this assignment, I implemented boundary detection using texture gradients. Unlike well-known edge detectors such as Canny and Sobel, which rely on intensity alone, I took the texture gradient into account as an additional cue. This suppresses responses inside regions of similar texture and gave better results than Canny and Sobel detection.
Even though Canny edge detection performs reasonably well and is still widely used today,
we can improve its performance by considering the texture of an image.
In this assignment, I considered two more cues for improvement: texture and brightness gradients.
One of the best ways to represent and suppress texture is to use a filter bank.
A filter bank is a set of filters, generally variations of Gaussian kernels.
In this project, I created a derivative-of-Gaussian filter by convolving a Sobel operator with a Gaussian kernel.
By rotating and scaling this filter, I obtained 32 filters (16 orientations x 2 scales).
Data Structure:
orientations[] (array of angles. 0 to 360)
sigma[] (array of sigmas. 1 to 2)
Functions:
gaussianFilter(width, height, sigma) - generate a Gaussian kernel
sobelFilter() - generate a 3x3 Sobel operator
applyConv(target, filter) - convolve the target with the filter
rotate(target, degree) - rotate an image by the given angle in degrees
Algorithm:
sobel_kernel[] ← sobelFilter()
for i from 1 to Number of sigma[] by 1
    gau_kernel[] ← gaussianFilter(11, 11, sigma[i])
    for j from 1 to Number of orientations[] by 1
        temp ← applyConv(gau_kernel, sobel_kernel)
        temp ← rotate(temp, orientations[j])
        fb(j,i) ← temp
<Algorithm 1. Generating Filter Banks >
The algorithm above produces the following filter bank:
< Fig 1. A set of filter bank >
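As a rough illustration of Algorithm 1, here is a minimal Python/NumPy sketch. It is not the original implementation: the names gaussian_kernel, conv2d_same, and make_filter_bank are hypothetical, and instead of calling rotate() it steers the filter as cos(theta) * Gx + sin(theta) * Gy. That is a swapped-in technique which approximates an oriented derivative-of-Gaussian filter without image interpolation.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    g = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def conv2d_same(image, kernel):
    """Naive 2-D convolution, zero padded, output the same size as image."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    h, w = image.shape
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros((h, w))
    for dy in range(kh):
        for dx in range(kw):
            out += flipped[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def make_filter_bank(n_orient=16, sigmas=(1.0, 2.0), size=11):
    """Return an array of shape (len(sigmas) * n_orient, size, size)."""
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    sobel_y = sobel_x.T
    angles = np.arange(n_orient) * (2.0 * np.pi / n_orient)  # 0 to 360 degrees
    bank = []
    for sigma in sigmas:
        g = gaussian_kernel(size, sigma)
        gx = conv2d_same(g, sobel_x)  # derivative of Gaussian along x
        gy = conv2d_same(g, sobel_y)  # derivative of Gaussian along y
        for theta in angles:
            # Assumption: steer the two derivatives instead of rotating an image.
            bank.append(np.cos(theta) * gx + np.sin(theta) * gy)
    return np.stack(bank)

bank = make_filter_bank()  # 16 orientations x 2 scales
```

Each filter is a derivative, so its coefficients sum to roughly zero and flat image regions produce no response.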
Once the filter bank is implemented, the next step is to build a set of half-disc filters. The procedure is similar to building the filter bank, and it greatly reduces the time needed to compute the chi-square distance, because the histograms can be gathered with filtering operations.
< Fig 2. A set of half-disc filter >
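One possible way to build such half-disc pairs is sketched below in Python/NumPy. The function name half_disc_masks, the radius, and the number of orientations are assumptions for illustration; the report does not state the values it used.

```python
import numpy as np

def half_disc_masks(radius, n_orient=8, tol=1e-9):
    """Return boolean masks of shape (n_orient, 2, size, size).

    masks[i, 0] and masks[i, 1] are the two halves of a disc split by a
    line through the centre at orientation i.
    """
    size = 2 * radius + 1
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    disc = x ** 2 + y ** 2 <= radius ** 2
    masks = np.zeros((n_orient, 2, size, size), dtype=bool)
    for i in range(n_orient):
        # Pairs over 0 to 180 degrees already cover every split direction.
        theta = i * np.pi / n_orient
        # Signed distance of each pixel from the dividing line.
        side = x * np.cos(theta) + y * np.sin(theta)
        masks[i, 0] = disc & (side > tol)
        masks[i, 1] = disc & (side < -tol)
    return masks

masks = half_disc_masks(radius=5, n_orient=8)
```

Pixels lying exactly on the dividing line belong to neither half, so the two halves of each pair always cover the same number of pixels.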
Convolving the greyscale input image with each filter in the bank gives a set of filter responses. Since we have 32 filters in this example, we get 32 response images, one per dimension, so every pixel is described by a 32-dimensional vector. Clustering these vectors with the k-means algorithm yields a single texton dictionary, which stores the cluster index assigned to each pixel.
Input:
img (input image)
k (number of clusters)
fb[] (a set of filter bank)
Output:
tmap[] (texton dictionary)
Data Structure:
Functions:
reshape(target, row, col) - reshape a matrix into row x col size
kmeans(k, data) - apply kmeans algorithm to calculate cluster index
applyConv(target, filter) - apply convolution(filter) to the target
Algorithm:
for i from 1 to num of filter bank by 1
    kmaps[i] ← reshape(applyConv(img, fb(i)), 1, [])
temp ← kmeans(k, kmaps)
tmap ← reshape(temp, h, w)
//where h and w are the height and width of img
<Algorithm 2. Calculate a texton dictionary >
< Fig 3. Visualized texton dictionary >
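The clustering step of Algorithm 2 might look like the following Python/NumPy sketch. It assumes the filter responses are already stacked into one array, and it uses a plain Lloyd's k-means with a deterministic farthest-point initialisation. The names kmeans_labels and texton_map and the initialisation scheme are assumptions, not the kmeans routine used in the project.

```python
import numpy as np

def kmeans_labels(data, k, n_iter=20):
    """Lloyd's k-means on data of shape (n_samples, n_features)."""
    data = np.asarray(data, dtype=float)
    # Assumption: farthest-point initialisation, chosen to be deterministic.
    centers = [data[0]]
    d2 = ((data - centers[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        centers.append(data[d2.argmax()])
        d2 = np.minimum(d2, ((data - centers[-1]) ** 2).sum(axis=1))
    centers = np.array(centers)
    for _ in range(n_iter):
        # Squared distances via |x|^2 - 2 x.c + |c|^2 to limit memory use.
        dist = ((data ** 2).sum(axis=1)[:, None]
                - 2.0 * data @ centers.T
                + (centers ** 2).sum(axis=1)[None, :])
        labels = dist.argmin(axis=1)
        for c in range(k):
            members = data[labels == c]
            if len(members):  # keep the old centre if a cluster empties
                centers[c] = members.mean(axis=0)
    return labels

def texton_map(responses, k):
    """responses: (n_filters, h, w) stack -> (h, w) map of cluster ids."""
    n_filters, h, w = responses.shape
    features = responses.reshape(n_filters, h * w).T  # one row per pixel
    return kmeans_labels(features, k).reshape(h, w)

# Toy input: three response images whose left and right halves differ.
responses = np.zeros((3, 4, 6))
responses[:, :, 3:] = 10.0
tmap = texton_map(responses, k=2)
```

In the toy input the two halves of the image end up in different clusters, which is the behaviour the texton dictionary relies on.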
Using the texton dictionary, we can calculate the chi-square distance between the texton histograms of the two half-discs around each pixel. If the two distributions are similar, the distance is small and so is the texture gradient; if they are dissimilar, the distance and the gradient are large. Applying the half-disc masks as filters reduces the cost of this computation. The chi-square distance is defined as:
$$\chi^2(g, h) = \frac{1}{2} \sum_{i=1}^{K} \frac{(g_i - h_i)^2}{g_i + h_i}$$
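To make the formula concrete, here is a small Python/NumPy sketch that evaluates it for two pairs of histograms. The function name chi_square is hypothetical, and eps is a small constant added to the denominator so that empty bins do not cause a division by zero.

```python
import numpy as np

def chi_square(g, h, eps=1e-10):
    """Chi-square distance between two histograms with K bins each."""
    g = np.asarray(g, dtype=float)
    h = np.asarray(h, dtype=float)
    return 0.5 * np.sum((g - h) ** 2 / (g + h + eps))

# Identical histograms: every term is zero, so the distance is 0.0.
same = chi_square([0.5, 0.5, 0.0], [0.5, 0.5, 0.0])

# Histograms with no bins in common: the distance is close to 1.0.
different = chi_square([1.0, 0.0], [0.0, 1.0])
```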
Input:
img (input image)
binvalues (number of clusters)
masks[] (a set of half-disc masks)
Output:
gradient[] (texture gradient)
Data Structure:
Functions:
logicMatrix(target, num) - make a logical matrix that is 1 where target equals num
applyConv(target, filter) - apply convolution(filter) to the target
getChiDistance(g, h) - get Chi-square distance
Algorithm:
index ← 1
for i from 1 to num of mask orientations by 1
    for j from 1 to num of mask radius by 1
        for k from 1 to num of clusters by 1
            tmp ← logicMatrix(img, k)
            g_i ← applyConv(tmp, masks(i,j,1))
            h_i ← applyConv(tmp, masks(i,j,2))
            gradient[index] ← gradient[index] + (g_i - h_i)^2 / (g_i+h_i+eps)
            //where eps is a very small value not to divide by 0
        index ++
gradient ← gradient/2
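The loop above can be sketched in Python/NumPy as follows. This is an illustration under stated assumptions rather than the original code: count_under_mask and texture_gradient are hypothetical names, the helper sums pixels under a mask centred on each pixel (a correlation, which gives the same chi-square value as a convolution because swapping the two halves does not change the result), and g and h are raw counts rather than normalized histograms.

```python
import numpy as np

def count_under_mask(binary, mask):
    """For each pixel, sum binary over the mask centred there (zero padded)."""
    mh, mw = mask.shape
    ph, pw = mh // 2, mw // 2
    h, w = binary.shape
    padded = np.pad(binary.astype(float), ((ph, ph), (pw, pw)))
    out = np.zeros((h, w))
    for dy, dx in zip(*np.nonzero(mask)):
        out += padded[dy:dy + h, dx:dx + w]
    return out

def texture_gradient(tmap, n_bins, mask_pairs, eps=1e-10):
    """tmap: (h, w) texton ids; mask_pairs: list of (half_a, half_b) masks."""
    grads = []
    for half_a, half_b in mask_pairs:
        chi = np.zeros(tmap.shape)
        for b in range(n_bins):
            member = (tmap == b)  # plays the role of logicMatrix
            g = count_under_mask(member, half_a)
            h = count_under_mask(member, half_b)
            chi += (g - h) ** 2 / (g + h + eps)
        grads.append(0.5 * chi)
    return np.stack(grads)

# Toy texton map with a vertical boundary between columns 5 and 6.
toy_tmap = np.zeros((9, 12), dtype=int)
toy_tmap[:, 6:] = 1
left = np.zeros((5, 5), dtype=bool)
left[:, :2] = True
right = np.zeros((5, 5), dtype=bool)
right[:, 3:] = True
grad = texture_gradient(toy_tmap, n_bins=2, mask_pairs=[(left, right)])
```

In the toy example the gradient is large at the boundary column and zero well inside the uniform region on the left.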
To use the texture gradient, multiply it element-wise with the output of an ordinary Canny or Sobel detector.
This is a very simple approach, but it improves the result considerably.
The result is compared with Canny and Sobel edge detection below.
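A minimal Python/NumPy sketch of this combination step is given below. The report does not say how the per-orientation gradients are reduced to a single map, so averaging them and rescaling to the range 0 to 1 are assumptions, as is the function name combine_with_baseline.

```python
import numpy as np

def combine_with_baseline(texture_grads, baseline_edges):
    """texture_grads: (n, h, w) stack; baseline_edges: (h, w) Canny or Sobel map."""
    # Assumption: average over orientations and scales, then rescale to [0, 1].
    tg = texture_grads.mean(axis=0)
    span = tg.max() - tg.min()
    tg = (tg - tg.min()) / span if span > 0 else np.zeros_like(tg)
    # Element-wise product: edges inside uniform texture are suppressed.
    return tg * baseline_edges

toy_grads = np.array([[[0.0, 2.0], [4.0, 0.0]]])
toy_edges = np.array([[1.0, 1.0], [1.0, 0.0]])
combined = combine_with_baseline(toy_grads, toy_edges)
```

A baseline edge survives only where the texture gradient is also high, which is how the product removes edges that fall inside a single texture.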
< Fig 4. Final result of myPb edge detection >
Here are a few artworks I generated with this project.
< Fig 5. 10 images of ground truth and their edges >