Speech Recognition using Deep Canonical Correlation Analysis in Noisy Environments

Shinnosuke Isobe; Satoshi Tamura; Satoru Hayamizu

Topics: Audio and Speech Analysis; Classification and Clustering; Deep Learning and Neural Networks; Image and Video Analysis and Understanding

In Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods ICPRAM - Volume 1, 63-70, 2021

Authors: Shinnosuke Isobe ; Satoshi Tamura and Satoru Hayamizu

Affiliation: Gifu University, Gifu, Japan

Keyword(s): Speech Recognition, Audio-visual Processing, Canonical Correlation Analysis, Noise Robustness, Data Augmentation, Deep Learning.

Abstract: In this paper, we propose a method to improve the accuracy of speech recognition in noisy environments by utilizing Deep Canonical Correlation Analysis (DCCA). DCCA generates projections from two modalities into one common space, so that the correlation of the projected vectors is maximized. Our idea is to employ DCCA techniques with audio and visual modalities to enhance the robustness of Automatic Speech Recognition (ASR): A) noisy audio features can be recovered using clean visual features, and B) an ASR model can be trained using both audio and visual features, as data augmentation. We evaluated our method using the audio-visual corpus CENSREC-1-AV and the noise database DEMAND. Compared to conventional ASR and feature-fusion-based audio-visual speech recognition, our DCCA-based recognizers achieved better performance. In addition, experimental results show that utilizing DCCA enables us to get better results in various noisy environments, thanks to the visual modality. Furthermore, it is found that DCCA can be used as a data augmentation scheme if only a small amount of training data is available, by incorporating visual DCCA features to build an audio-only ASR model, in addition to audio DCCA features.
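
The projection idea in the abstract can be illustrated with classical linear CCA, which DCCA extends by placing a neural network in front of each modality. The sketch below is not the authors' implementation and is not DCCA itself: it is plain linear CCA in NumPy, run on synthetic data that stands in for audio and visual features. The function name `linear_cca`, the regularization value, and the feature dimensions are all assumptions made for illustration.

```python
import numpy as np


def linear_cca(X, Y, k, reg=1e-6):
    """Linear CCA: return projections A, B and the top-k canonical correlations.

    X: (n, dx) features of view 1, Y: (n, dy) features of view 2.
    `reg` is a small ridge term (an assumption) that keeps the covariances invertible.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Regularized within-view covariances and the cross-covariance.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is)
    A = Sxx_is @ U[:, :k]
    B = Syy_is @ Vt.T[:, :k]
    return A, B, s[:k]


# Synthetic stand-in for two modalities that share a 2-D latent signal.
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 2))
audio = z @ rng.standard_normal((2, 6)) + 0.1 * rng.standard_normal((500, 6))
visual = z @ rng.standard_normal((2, 4)) + 0.1 * rng.standard_normal((500, 4))

A, B, corr = linear_cca(audio, visual, k=2)

# Project both views into the common space and check the first pair empirically.
u = (audio - audio.mean(axis=0)) @ A[:, 0]
v = (visual - visual.mean(axis=0)) @ B[:, 0]
empirical = np.corrcoef(u, v)[0, 1]
print(corr, empirical)
```

In this toy setting the two views are projected into a shared two-dimensional space in which corresponding components are highly correlated. A deep variant would replace the fixed linear maps with trained networks while optimizing the same correlation objective.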

CC BY-NC-ND 4.0

Paper citation in several formats:

Isobe, S.; Tamura, S. and Hayamizu, S. (2021). Speech Recognition using Deep Canonical Correlation Analysis in Noisy Environments. In Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - ICPRAM; ISBN 978-989-758-486-2; ISSN 2184-4313, SciTePress, pages 63-70. DOI: 10.5220/0010268200630070

@conference{icpram21,
author={Shinnosuke Isobe and Satoshi Tamura and Satoru Hayamizu},
title={Speech Recognition using Deep Canonical Correlation Analysis in Noisy Environments},
booktitle={Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - ICPRAM},
year={2021},
pages={63-70},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010268200630070},
isbn={978-989-758-486-2},
issn={2184-4313},
}

TY - CONF

JO - Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - ICPRAM
TI - Speech Recognition using Deep Canonical Correlation Analysis in Noisy Environments
SN - 978-989-758-486-2
IS - 2184-4313
AU - Isobe, S.
AU - Tamura, S.
AU - Hayamizu, S.
PY - 2021
SP - 63
EP - 70
DO - 10.5220/0010268200630070
PB - SciTePress