Loading [a11y]/accessibility-menu.js
Unsupervised Gaze Representation Learning from Multi-view Face Images | IEEE Conference Publication | IEEE Xplore

Unsupervised Gaze Representation Learning from Multi-view Face Images


Abstract:

Annotating gaze is an expensive and time-consuming endeavor, requiring costly eye-trackers or complex geometric calibration procedures. Although some eye-based unsupervis...Show More

Abstract:

Annotating gaze is an expensive and time-consuming endeavor, requiring costly eye-trackers or complex geometric calibration procedures. Although some eye-based unsupervised gaze representation learning methods have been proposed, the quality of gaze representation extracted by these methods degrades severely when the head pose is large. In this paper, we present the Multi-View Dual-Encoder (MVDE), a framework designed to learn gaze representations from unlabeled multi-view face images. Through the proposed Dual-Encoder architecture and the multi-view gaze representation swapping strategy, the MV-DE successfully disentangles gaze from general facial information and derives gaze representations closely tied to the subject's eye-ball rotation without gaze label. Experimental results illustrate that the gaze representations learned by the MV-DE can be used in downstream tasks, including gaze estimation and redirection. Gaze estimation results indicates that the proposed MV-DE displays notably higher robustness to uncontrolled head movements when compared to state-of-the-art (SOTA) unsupervised learning methods.
Date of Conference: 16-22 June 2024
Date Added to IEEE Xplore: 16 September 2024
ISBN Information:

ISSN Information:

Conference Location: Seattle, WA, USA

Funding Agency:


Contact IEEE to Subscribe

References

References is not available for this document.