
Speaker-Independent Lipreading By Disentangled Representation Learning


Abstract:

With the development of deep learning technology, automatic lipreading based on deep neural networks can achieve reliable results for speakers who appear in the training dataset. However, speaker-independent lipreading, i.e. lipreading for unseen speakers, remains a challenging task, especially when training samples are limited. To improve recognition performance in the speaker-independent scenario, a new deep neural network structure, named the Disentangled Visual Speech Recognition Network (DVSR-Net), is proposed in this paper. DVSR-Net is designed to disentangle identity-related features and content-related features from the lip image sequence. To further eliminate the identity information that remains in the content features, a content feature refinement stage is incorporated into network optimization. In this way, the extracted features are closely related to the content information and invariant to speakers' varied talking styles, and thus speech recognition performance for unseen speakers can be improved. Experiments on two widely used datasets demonstrate the effectiveness of the proposed network in the speaker-independent scenario.
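The abstract does not detail DVSR-Net's architecture, but a minimal sketch can illustrate the general idea of disentangling content from identity. The PyTorch snippet below is a hypothetical illustration, not the authors' implementation: it assumes a shared 3D-convolution frontend feeding a content branch and an identity branch, and uses a gradient-reversal speaker classifier as one plausible stand-in for the content feature refinement stage. All module names, dimensions, and the number of speakers are assumptions.

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity in the forward pass,
    negated (scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class TwoBranchEncoder(nn.Module):
    """Hypothetical encoder splitting a lip-image sequence into
    per-frame content features and a per-clip identity feature."""
    def __init__(self, feat_dim=256):
        super().__init__()
        # Shared frontend over (B, 1, T, H, W) grayscale lip crops.
        self.frontend = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=(5, 7, 7),
                      stride=(1, 2, 2), padding=(2, 3, 3)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),  # pool space, keep time
        )
        # Content branch: temporal features for the recognizer.
        self.content_head = nn.GRU(32, feat_dim, batch_first=True)
        # Identity branch: a single speaker descriptor per clip.
        self.identity_head = nn.Linear(32, feat_dim)

    def forward(self, x):
        h = self.frontend(x).squeeze(-1).squeeze(-1)  # (B, 32, T)
        h = h.transpose(1, 2)                         # (B, T, 32)
        content, _ = self.content_head(h)             # (B, T, feat_dim)
        identity = self.identity_head(h.mean(dim=1))  # (B, feat_dim)
        return content, identity

# Toy usage: an adversarial speaker classifier on the content features.
# Reversed gradients push the encoder toward content features that
# carry as little speaker identity as possible.
enc = TwoBranchEncoder()
speaker_clf = nn.Linear(256, 10)    # 10 training speakers (assumed)
x = torch.randn(2, 1, 16, 64, 64)   # 2 clips, 16 frames, 64x64 crops
content, identity = enc(x)
spk_logits = speaker_clf(GradReverse.apply(content.mean(dim=1), 1.0))
print(content.shape, identity.shape, spk_logits.shape)

In a full training loop, the reversed speaker loss would be combined with a recognition loss on the content branch and a speaker-classification loss on the identity branch; the structure shown here is illustrative only.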
Date of Conference: 19-22 September 2021
Date Added to IEEE Xplore: 23 August 2021
Conference Location: Anchorage, AK, USA
