Abstract:
Speech emotion recognition (SER) has gained pivotal attention for various applications in human-computer interaction and affective computing. In recent years, there has been growing interest in developing robust and accurate systems for identifying emotions from speech utterances. In this work, a novel approach based on the Wav2Vec2 architecture is used to demonstrate SER system performance. The Wav2Vec2 model, which implements a contrastive learning objective during its pre-training stage, extracts speech features from utterances; these features are fed to a feed-forward network to identify emotions on two datasets, namely the Toronto Emotional Speech Set (TESS) and the Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D). On CREMA-D, the system achieved an accuracy of 76%, with a weighted F1 score, precision, and recall of 0.76, 0.77, and 0.77, respectively. On TESS, it achieved an accuracy of 99%, with a weighted F1 score, precision, and recall all equal to 0.99.
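The pipeline described above (Wav2Vec2 as a feature extractor feeding a feed-forward classifier head) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `EmotionClassifier`, the hidden size of 256, mean-pooling over time, and the six-class label count (matching CREMA-D's anger, disgust, fear, happy, neutral, sad) are all assumptions, and a real system would load pretrained Wav2Vec2 weights rather than a randomly initialized encoder.

```python
import torch
import torch.nn as nn
from transformers import Wav2Vec2Config, Wav2Vec2Model

NUM_EMOTIONS = 6  # assumed: CREMA-D's six emotion categories


class EmotionClassifier(nn.Module):
    """Hypothetical sketch: Wav2Vec2 encoder + feed-forward emotion head."""

    def __init__(self, num_emotions: int = NUM_EMOTIONS):
        super().__init__()
        # Randomly initialized for a self-contained example; in practice one
        # would use Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base"),
        # whose contrastive pre-training objective is what the paper leverages.
        self.encoder = Wav2Vec2Model(Wav2Vec2Config())
        self.head = nn.Sequential(
            nn.Linear(self.encoder.config.hidden_size, 256),  # 768 -> 256
            nn.ReLU(),
            nn.Linear(256, num_emotions),
        )

    def forward(self, input_values: torch.Tensor) -> torch.Tensor:
        # input_values: raw waveform batch, shape (batch, samples) at 16 kHz
        hidden = self.encoder(input_values).last_hidden_state  # (B, T, 768)
        pooled = hidden.mean(dim=1)  # mean-pool frame features over time
        return self.head(pooled)     # emotion logits, shape (B, num_emotions)


model = EmotionClassifier()
waveform = torch.randn(1, 16000)  # one second of dummy audio at 16 kHz
logits = model(waveform)
```

Mean-pooling the frame-level encoder outputs into one utterance-level vector is one common design choice; the paper does not specify its pooling strategy here.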
Date of Conference: 04-06 December 2023
Date Added to IEEE Xplore: 02 April 2024