Proceedings of the 13th IEEE International Conference on Advanced Video and Signal Based Surveillance, IEEE, 2016.
13th IEEE International Conference on Advanced Video and Signal Based Surveillance, Colorado Springs, USA, August 23 - 26, 2016
Security and safety applications such as surveillance or forensics demand face recognition in low-resolution video data. We propose a method for low-resolution video face recognition that combines a Convolutional Neural Network (CNN) with a manifold-based track comparison strategy. The low-resolution domain is addressed by adjusting the network architecture to prevent bottlenecks or significant upscaling of face images. The CNN is trained on a combination of a large-scale self-collected video face dataset and large-scale public image face datasets, resulting in about 1.4M training images. To handle large amounts of video data and enable effective comparison, the CNN face descriptors are compared efficiently at track level by local patch means. Our setup achieves 80.3 percent accuracy on a 32x32 pixel low-resolution version of the YouTube Faces Database and outperforms local image descriptors as well as the state-of-the-art VGG-Face network in this domain. The superior performance of the proposed method is confirmed on a self-collected in-the-wild surveillance dataset.
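As a rough illustration of comparing descriptors at track level by local patch means, the following minimal sketch may help. The abstract does not specify how patches are formed or which distance is used, so everything here is an assumption: a "patch" is taken to be a group of consecutive per-frame descriptors, patch means are compared with cosine distance, and the track distance is the smallest distance between any pair of patch means. The names `patch_means`, `cosine_distance`, and `track_distance` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of a non-empty group of descriptors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def patch_means(track: Sequence[Sequence[float]], patch_size: int) -> List[Vector]:
    """Summarize a track by the means of consecutive descriptor groups.

    Assumption: a local patch is `patch_size` consecutive frames.
    The last patch may be shorter if the track length is not a multiple.
    """
    if patch_size < 1:
        raise ValueError("patch_size must be at least 1")
    return [
        mean_vector(track[i:i + patch_size])
        for i in range(0, len(track), patch_size)
    ]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def track_distance(
    track_a: Sequence[Sequence[float]],
    track_b: Sequence[Sequence[float]],
    patch_size: int = 4,
) -> float:
    """Assumed track distance: minimum over all pairs of patch means."""
    means_a = patch_means(track_a, patch_size)
    means_b = patch_means(track_b, patch_size)
    return min(cosine_distance(a, b) for a in means_a for b in means_b)
```

Under these assumptions, each track is reduced to a handful of mean descriptors, so a comparison costs on the order of the product of the patch counts rather than the product of the frame counts. That reduction is the kind of saving the track-level comparison is meant to provide on large video collections.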