Effectiveness of Detection-based and Regression-based Approaches for Estimating Mask-Wearing Ratio

Authors: Khanh-Duy Nguyen, Huy H. Nguyen, Trung-Nghia Le, Junichi Yamagishi, Isao Echizen

#Image Processing
#Other

2021 IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)

Estimating the mask-wearing ratio in public places is important because it enables health authorities to analyze the situation promptly and implement policies. Methods for estimating the mask-wearing ratio through image analysis have been reported, but comprehensive research on both methodologies and datasets is still lacking. Most recent reports simply estimate the ratio by applying conventional object detection and classification methods. Regression-based approaches can also estimate the number of people wearing masks, especially in congested scenes with tiny and occluded faces, but they have not been well studied. A large-scale, well-annotated dataset is also still needed. In this paper, we propose two methods for ratio estimation, one built on a detection-based approach and the other on a regression-based approach. For the detection-based approach, we improve a state-of-the-art face detector, RetinaFace, for ratio estimation. For the regression-based approach, we take a baseline network, CSRNet, and fine-tune it to estimate density maps for masked and unmasked faces. We also present the first large-scale dataset for this task, the "NFM" dataset, which contains 581,108 face annotations extracted from 18,088 video frames in 17 street-view videos. The annotations (bounding boxes and labels) and pre-trained models will be released with the publication of our paper. In our experiments, the RetinaFace-based method achieves better accuracy across different situations, while the CSRNet-based method is superior in terms of operation time thanks to its compactness.
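As a rough illustration of how the two approaches could each yield a ratio, the sketch below computes it once from per-face detection labels and once from a pair of density maps. It is not the authors' implementation. It assumes the ratio is defined as masked faces divided by all faces, which the abstract does not state explicitly, and the function names, the label strings "masked" and "unmasked", and the 0.5 score threshold are all hypothetical.

```python
from typing import Sequence


def ratio_from_detections(
    labels: Sequence[str],
    scores: Sequence[float],
    threshold: float = 0.5,  # hypothetical confidence cut-off
) -> float:
    """Detection-based sketch: count faces classified as masked.

    `labels` holds one class string per detected face ("masked" or
    "unmasked", names assumed here) and `scores` the detector confidence.
    """
    kept = [lab for lab, score in zip(labels, scores) if score >= threshold]
    if not kept:
        return 0.0  # no confident detections, so no ratio can be formed
    masked = sum(1 for lab in kept if lab == "masked")
    return masked / len(kept)


def ratio_from_density_maps(
    masked_map: Sequence[Sequence[float]],
    unmasked_map: Sequence[Sequence[float]],
) -> float:
    """Regression-based sketch: a density map integrates to a count.

    Summing every pixel of each map gives the estimated number of masked
    and unmasked faces without localizing any individual face.
    """
    masked = sum(sum(row) for row in masked_map)
    unmasked = sum(sum(row) for row in unmasked_map)
    total = masked + unmasked
    return masked / total if total > 0 else 0.0


if __name__ == "__main__":
    # Toy inputs, not taken from the NFM dataset.
    print(ratio_from_detections(
        ["masked", "unmasked", "masked", "masked"],
        [0.9, 0.8, 0.3, 0.7],
    ))
    print(ratio_from_density_maps(
        [[0.5, 0.5], [1.0, 0.0]],
        [[0.25, 0.25], [0.25, 0.25]],
    ))
```

The detection path depends on finding each face, which becomes harder when faces are tiny or occluded, whereas the density path only needs the per-class totals. That difference is the motivation the abstract gives for considering regression in congested scenes.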
