CSCI 1430 Project 4
1. Build a set of positive example crops and negative example crops
Crop Features: The SIFT feature set was used to describe each image crop for this experiment, since it encapsulates image gradients, creating a richer, lower-dimensional representation of each crop. Initially, crops were extracted from images and then converted to SIFT features, but for faster runtime the middle step was removed: instead of extracting crops from the images, SIFT features were sampled from the image at the location where each crop would have been centered. This eliminates unnecessary crop extraction and the repeated calculation of SIFT features for overlapping crops.
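As a minimal sketch of this sampling scheme, assuming OpenCV's SIFT implementation and an illustrative crop size and stride (the report does not specify these values), descriptors can be computed directly at the would-be crop centers:

```python
import cv2

# Hypothetical sketch: compute SIFT descriptors at would-be crop centers
# instead of extracting the crops first. `crop_size` and `stride` are
# illustrative assumptions, not values from the report.
def sample_sift_features(image, crop_size=36, stride=18):
    sift = cv2.SIFT_create()
    half = crop_size // 2
    h, w = image.shape[:2]
    # One keypoint per would-be crop center; the keypoint size stands in
    # for the crop's support region.
    keypoints = [cv2.KeyPoint(float(x), float(y), float(crop_size))
                 for y in range(half, h - half, stride)
                 for x in range(half, w - half, stride)]
    # compute() evaluates descriptors only at the given keypoints, so
    # overlapping crops never cause repeated crop extraction or
    # recomputation of shared gradients.
    _, descriptors = sift.compute(image, keypoints)
    return descriptors  # one 128-D SIFT descriptor per sampled location
```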
The baseline training set for the face detection SVM includes random positive examples and random negative examples. The random negative examples are taken from a database of non-face images, so they are guaranteed to be negatives. A "hard negative" is a false positive detection, meaning that the classifier marked a non-face crop as a face. In theory, an SVM can be refined by iteratively incorporating these hard negatives into the training set and retraining. To implement this mechanism, the SVM is first trained on random negative crops, establishing a baseline. The trained SVM then evaluates a set of strictly non-face images, so that any positives it returns are guaranteed to be false positives. A subset of the collected false positives is then incorporated into the negative training crops, and the SVM is retrained. This approach should produce a more refined SVM at each iteration, since difficult features that it misclassifies are identified and then explicitly trained into it. In practice, the improvement that mining hard negatives produces is relatively small.
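A minimal sketch of one mining round, assuming `pos_feats` and `rand_neg_feats` are arrays of crop features, `non_face_images` is the strictly non-face image set, and a scikit-learn RBF-kernel SVM stands in for the classifier (the report does not name a library):

```python
import numpy as np
from sklearn.svm import SVC

# 1. Baseline: train on positives and random negatives (label 1 = face).
X = np.vstack([pos_feats, rand_neg_feats])
y = np.r_[np.ones(len(pos_feats)), np.zeros(len(rand_neg_feats))]
svm = SVC(kernel='rbf').fit(X, y)

# 2. Mine: every positive prediction on a non-face image is a
#    guaranteed false positive, i.e. a hard negative.
hard = []
for img in non_face_images:
    feats = sample_sift_features(img)  # sketch from the previous section
    hard.append(feats[svm.predict(feats) == 1])
hard_negs = np.vstack(hard)

# 3. Retrain with the hard negatives added to the negative set.
X = np.vstack([X, hard_negs])
y = np.r_[y, np.zeros(len(hard_negs))]
svm = SVC(kernel='rbf').fit(X, y)
```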
Results: Both linear and non-linear SVMs were trained, and even with many fewer training examples, the non-linear SVM outperformed the linear SVM. Linear SVMs can take advantage of larger training sets, but non-linear SVMs can form more complex classification boundaries, allowing for more nuanced use of image features. So even with fewer training examples, the non-linear SVM can extract more discriminative information.
(Linear SVM, 2000 positive crops, 2000 random negative crops)
(Non-linear SVM, 250 positive crops, 250 random negative crops)
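For reference, a sketch of the two classifier types compared above, assuming scikit-learn and illustrative hyperparameters; `X_train` and `y_train` stand for the SIFT crop features and their face / non-face labels:

```python
from sklearn.svm import LinearSVC, SVC

# Linear SVM: a single hyperplane in feature space; scales well to
# large training sets.
linear_svm = LinearSVC(C=1.0).fit(X_train, y_train)

# Non-linear SVM: the RBF kernel lets the decision boundary bend around
# structure a hyperplane cannot separate, at higher training cost.
nonlinear_svm = SVC(kernel='rbf', gamma='scale').fit(X_train, y_train)
```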
Incorporating the hard negative features into the training set produced little improvement over the same number of random negative examples, and in this case a slight decrease.
(Non-linear SVM, 500 positive crops, 1000 random negative crops)
For incorporating hard negatives, two strategies were attempted: expanding the set of negative crops, or replacing half of the random negative crops with hard negatives (see the sketch after the captions below). Both performed worse than the baseline without hard negatives, and between the two strategies, expansion worked better, since it preserved the negative crops that were used to train the SVM and ultimately to produce the hard negatives. In other words, including the hard negatives worked best as a way to build on the existing trained SVM, refining its discriminations, rather than as a way to create a new SVM with new discrimination flaws.
(Linear SVM, 2000 positive crops, 2000 random negative crops)
(Linear SVM, 2000 positive crops, 1000 random negative crops (half of original negative crops), 1000 hard negative crops)
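A sketch of the two strategies, assuming `rand_negs` and `hard_negs` are arrays of negative crop features:

```python
import numpy as np

# Strategy 1 (expand): keep every random negative and append the hard
# ones, preserving the crops the mining SVM was trained on.
negs_expanded = np.vstack([rand_negs, hard_negs])

# Strategy 2 (replace): swap out half of the random negatives for an
# equal number of hard negatives, keeping the set size fixed.
half = len(rand_negs) // 2
negs_replaced = np.vstack([rand_negs[half:], hard_negs[:half]])
```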
The best performance achieved hovered around 76% accuracy. Examining the false positives helps to reveal the features that the SVM is sensitive to, mainly circles or circular areas with detail towards the center, as can be seen below.
(Non-linear SVM, 1000 positive crops, 1000 random negative crops)
False Positives