【Machine Learning】【Python】Chapter 5: Sliding Window + SVM + NMS for Localization ---- "SVM Object Classification and Localization Detection"

I recommend using EdgeBoxes to generate candidate bounding boxes; it is faster and more accurate than sliding windows (SW). If you are interested, the anchor technique used in the RPN (Region Proposal Network) of Faster R-CNN is also worth trying.

The series of blog posts on "SVM Object Classification and Localization Detection" has come to an end.

Let me summarize the best method I used.

1. Extract features using HoG.
2. Reduce the dimensionality of the features with PCA, then optimize the SVM parameters C and gamma using PSO. The dimensionality reduction is only there to speed up PSO, which would otherwise be too slow; the trade-off is somewhat lower classification performance.
3. Train an initial SVM model using the features from step 1 and the parameters from step 2.
4. Refine the SVM model using Hard Negative Mining (HNM).
5. Run detection with sliding windows, then merge overlapping detection boxes using NMS (non-maximum suppression).
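As a rough illustration of the final NMS step, here is a minimal greedy sketch in plain Python. It is not the code from my repository: the `(x1, y1, x2, y2)` box format, the function names `iou` and `nms`, and the IoU threshold of 0.3 are all assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Toy detections (made up): two heavily overlapping boxes and one separate box.
boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (300, 300, 400, 400)]
scores = [0.9, 0.8, 0.7]  # e.g. SVM decision values for each window
kept = nms(boxes, scores)  # indices of the boxes that survive
```

Here the second box overlaps the first with an IoU of about 0.68, so only the first and third boxes are kept. A lower threshold suppresses more aggressively; the right value depends on how densely the windows are sampled.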

Currently, the training set has 2700 positive samples and 2700 negative samples, and the test set has 700 positive samples and 1200 negative samples. With the parameters optimized by PSO, the trained model reaches a classification accuracy of 84%.

However, the detection boxes are not very accurate, and there are many false detections.

------------[Updated on 2017.07.10]-------------------------------------------------------------------------------------------

I made some modifications to the detection code and processed the dataset. With the parameters optimized by PSO, the SVM model has a test accuracy of around 85.6%. After one round of HNM with a step size of 100 pixels, the accuracy drops to around 83%, but the number of false detections is greatly reduced. I then ran HNM again with a step size of 50 pixels, and the current model gives very good detection results with few false detections. The correct detections are not yet stable, but they already meet the requirements.

With a step size of 100 pixels, the final SVM model was trained on 39,000 feature vectors and has a model size of 116 MB. With a step size of 50 pixels, it was trained on over 90,000 feature vectors and has a model size of 190 MB.
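To make the effect of the HNM step size concrete, here is a minimal sketch of scanning one object-free image and collecting false positives. It is an illustration under assumptions, not my actual code: the 640x480 image, the 100x100 window, and the `score_fn` callback (standing in for "crop the window, extract HoG, call the SVM") are all hypothetical.

```python
def window_positions(img_w, img_h, win_w, win_h, step):
    """Yield the top-left corner of every window that fits inside the image."""
    for y in range(0, img_h - win_h + 1, step):
        for x in range(0, img_w - win_w + 1, step):
            yield x, y


def mine_hard_negatives(img_w, img_h, win_w, win_h, step, score_fn, threshold=0.0):
    """Scan an image known to contain no objects. Any window the current
    model scores above the threshold is a false positive (a hard negative)."""
    hard = []
    for x, y in window_positions(img_w, img_h, win_w, win_h, step):
        if score_fn(x, y) > threshold:
            hard.append((x, y))
    return hard


# Hypothetical sizes: a 640x480 negative image and a 100x100 detection window.
count_100 = len(list(window_positions(640, 480, 100, 100, step=100)))
count_50 = len(list(window_positions(640, 480, 100, 100, step=50)))


# Stand-in scorer: pretend the model fires on the right-hand part of the image.
def toy_score(x, y):
    return 1.0 if x >= 400 else -1.0


hard = mine_hard_negatives(640, 480, 100, 100, step=100, score_fn=toy_score)
```

With these made-up sizes, halving the step from 100 to 50 pixels raises the number of windows per image from 24 to 88, which is why the smaller step yields many more hard negatives and a larger retrained model. The collected windows are added to the negative set and the SVM is retrained.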

By the way, HoG gives over 2900 features per image. For PSO, I originally reduced the dimensionality to 500 with PCA. To get better results, I later reduced it only to 2000 for PSO. This is much slower, but the results are slightly better.
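For readers unfamiliar with PSO, here is a minimal global-best PSO sketch in plain Python that searches over log10(C) and log10(gamma). It is only a sketch under assumptions: the inertia and acceleration coefficients, the search bounds, and especially `toy_error` are made up for this example. In the real pipeline the objective would be the SVM's validation error on the PCA-reduced features, which is exactly the expensive part that the dimensionality reduction speeds up.

```python
import random


def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` inside the box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]               # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(p):
    """Placeholder objective with its minimum at log10(C)=1, log10(gamma)=-2.
    Replace with the cross-validation error of an SVM trained with these values."""
    log_c, log_gamma = p
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best, best_err = pso(toy_error, bounds=[(-2.0, 4.0), (-6.0, 1.0)])
C, gamma = 10 ** best[0], 10 ** best[1]
```

Searching in log space is a design choice for this sketch, since useful values of C and gamma typically span several orders of magnitude. Every call to the objective means training and evaluating one SVM, so the total cost is roughly `n_particles * n_iters` training runs.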

**Latest code on GitHub:** https://github.com/HansRen1024/SVM-classification-localization