Abstract:
Real-world Facial Expression Recognition (FER) suffers from noisy labels due to ambiguous expressions and subjective annotation. Overall, addressing noisy-label FER involves two core issues: the efficient utilization of clean samples and the effective utilization of noisy samples. However, existing methods demonstrate their effectiveness solely through the generalization improvement obtained by using all corrupted data, making it difficult to ascertain whether the observed improvement genuinely addresses these two issues. To decouple this dilemma, this paper focuses on efficiently utilizing clean samples by diving into sample selection. Specifically, we enhance the classical noisy-label learning method Co-divide with two straightforward modifications, introducing a noisy-label discriminator better suited to FER, termed IntraClass-divide. Firstly, IntraClass-divide constructs a separate two-component Gaussian Mixture Model (GMM) for each category instead of a single GMM shared across all categories. Secondly, IntraClass-divide simplifies the framework by eliminating the dual-network training scheme. In addition to achieving leading sample selection performance of nearly 95% Micro-F1 under the standard synthetic noise paradigm, we are the first to propose a natural noise paradigm, on which we also achieve leading sample selection performance of 82.63% Micro-F1. Moreover, training a ResNet-18 on the clean samples identified by IntraClass-divide yields better generalization performance than previous sophisticated noisy-label FER models trained on all corrupted data.
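The per-class selection idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `intraclass_divide`, the posterior threshold of 0.5, and the use of scikit-learn's `GaussianMixture` are all assumptions made for clarity. The core idea matches the abstract: fit a two-component GMM on the per-sample losses of each class independently, and treat samples assigned to the low-mean (low-loss) component as clean.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def intraclass_divide(losses, labels, num_classes, clean_prob_thresh=0.5):
    """Per-class clean/noisy split (illustrative sketch).

    For each class, fit a two-component 1-D GMM on that class's per-sample
    training losses; the component with the lower mean loss is taken as the
    "clean" component, and samples whose posterior probability for it
    exceeds `clean_prob_thresh` (a hypothetical threshold) are kept.
    """
    clean_mask = np.zeros(len(losses), dtype=bool)
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        if len(idx) < 2:
            # Too few samples to fit a mixture; keep them by default.
            clean_mask[idx] = True
            continue
        x = losses[idx].reshape(-1, 1)
        gmm = GaussianMixture(n_components=2, reg_covar=5e-4,
                              random_state=0).fit(x)
        # The low-mean component models clean (well-fit) samples.
        clean_comp = int(np.argmin(gmm.means_.ravel()))
        p_clean = gmm.predict_proba(x)[:, clean_comp]
        clean_mask[idx] = p_clean > clean_prob_thresh
    return clean_mask
```

Fitting one GMM per class (rather than one shared GMM, as in Co-divide) accounts for the fact that different expression categories can have very different loss distributions, so a single global threshold would over- or under-select within some classes.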
Published in: IEEE Transactions on Biometrics, Behavior, and Identity Science (Volume: 7, Issue: 1, January 2025)