Vincent Casser

Machine Learning Researcher, Software Engineer

Semantic Segmentation

Investigating Multiscale Class Activation Mapping for ConvNet-based Car Detection

V. Casser: “Investigating Multiscale Class Activation Mapping for ConvNet-based Car Detection.” Bachelor thesis, supervised by V. Steinhage and A. Weber, 2016.

In this paper we apply Class Activation Mapping (CAM) to the problem of car detection. To this end, we perform weakly supervised training of different Convolutional Neural Networks on large-scale datasets depicting cars and evaluate them on artificial and realistic image data. To also detect small objects, we combine CAM with a multiscale sliding-window approach. We show that (i) CAM considerably improves localization compared to aggregating the results of a simple sliding-window classifier, and (ii) our multiscale sliding-window extension improves localization quality for small objects by increasing the mapping resolution, at the expense of higher computational cost. The presented car detector is robust and performs well on both artificial and challenging realistic data, achieving F1-scores of 0.82 and 0.77, respectively, under our metric.
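To illustrate the core idea, the following is a minimal sketch of how a class activation map is commonly computed, assuming a network whose last convolutional layer is followed by global average pooling and a single fully connected layer. The function names, array shapes, and the use of NumPy are assumptions made for illustration; they are not taken from the thesis implementation.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Weighted sum of the final conv feature maps for one class.

    features:   array of shape (K, H, W), the K final feature maps (assumed layout)
    fc_weights: array of shape (num_classes, K), weights of the layer after
                global average pooling (assumed layout)
    """
    class_weights = fc_weights[class_idx]  # shape (K,)
    # Sum over the channel axis: result has shape (H, W).
    return np.tensordot(class_weights, features, axes=([0], [0]))

def normalize_map(cam):
    """Rescale a map to [0, 1] for visualization or thresholding."""
    shifted = cam - cam.min()
    peak = shifted.max()
    return shifted / peak if peak > 0 else shifted
```

Under these assumptions, the spatial mean of the map equals the class score before the bias term, which is why high-valued regions can be read as evidence for the class. In a multiscale setting, such a map could be computed per window and scale and then merged into one higher-resolution map.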

Supplementary Material

Precision-recall for baseline (a) vs. proposed method (b).

The image annotation software developed to create ground-truth data.

Example results of our base classifier for positive and negative samples.

Example image annotation masks.