Stressed Speech Recognition Using Smartphone and Embedded Device Integration

Authors:

Barlian Henryranu Prasetio,

Muhammad Nabil Aljufri,

Dahnial Syauqy,

Edita Rosana Widasari

SIET '23: Proceedings of the 8th International Conference on Sustainable Information Engineering and Technology

Pages 599 - 607

https://doi.org/10.1145/3626641.3626682

Published: 27 December 2023

Abstract

Stress is a state of emotional tension caused by factors such as work, study, and family. If it is not managed promptly, stress can worsen and affect health. Several studies have presented methods for detecting people's emotions from their voices. The goal of this study is to determine from a person's voice whether they are stressed and how severe that stress is. The stress level is estimated from speech using MFCC (mel-frequency cepstral coefficient) feature extraction and an artificial neural network. The system is built on a Raspberry Pi 4 connected through Bluetooth to a microphone and an application on an Android phone. The smartphone application was designed to integrate with the embedded system and to display the prediction result. The dataset used in this research was SUSAS (Speech Under Simulated and Actual Stress), consisting of 1860 utterances. When the artificial neural network model was developed using 70% of the data for training, it achieved an accuracy of only 76%. However, the overall integrated system reached 90% accuracy when tested on 30 samples taken from the dataset. The average computing time per test is 2 seconds.
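The abstract names MFCC feature extraction but gives no parameters, so the following is only a minimal, dependency-free sketch of the general MFCC pipeline (framing, Hamming window, power spectrum, mel filterbank, log, DCT). The 8 kHz sample rate, frame sizes, filter count, and every function name here are illustrative assumptions rather than the authors' settings, and pre-emphasis is omitted for brevity.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame, n_fft):
    """Naive DFT power spectrum for bins 0..n_fft/2 (slow, but stdlib-only)."""
    padded = frame + [0.0] * (n_fft - len(frame))
    spectrum = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n, x in enumerate(padded))
        spectrum.append(abs(acc) ** 2 / n_fft)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    bank = []
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):       # rising edge
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank


def mfcc(signal, sample_rate=8000, frame_len=200, hop=80,
         n_fft=256, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs cepstral coefficients per frame.

    All default values are assumptions chosen for illustration.
    """
    bank = mel_filterbank(n_filters, n_fft, sample_rate)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        power = power_spectrum(frame, n_fft)
        # Small constant avoids log(0) when a filter captures no energy.
        log_energies = [math.log(sum(f * p for f, p in zip(filt, power)) + 1e-10)
                        for filt in bank]
        # DCT-II of the log filterbank energies gives the cepstral coefficients.
        coeffs = [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                      for m, e in enumerate(log_energies))
                  for c in range(n_coeffs)]
        features.append(coeffs)
    return features


# Usage on a synthetic 0.1 s, 440 Hz tone (a stand-in for a real utterance).
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
feats = mfcc(tone)
# Averaging over frames yields one fixed-length vector per utterance,
# which is one simple way to feed a fully connected network.
mean_vector = [sum(col) / len(feats) for col in zip(*feats)]
```

In practice a library routine with an FFT would replace the naive DFT; the per-utterance mean vector is just one possible input representation for a neural network classifier and is not stated in the abstract.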

Index Terms

Stressed Speech Recognition Using Smartphone and Embedded Device Integration
1. Applied computing
  1. Life and medical sciences
    1. Health informatics

Published In

SIET '23: Proceedings of the 8th International Conference on Sustainable Information Engineering and Technology

October 2023

722 pages

ISBN:9798400708503

DOI:10.1145/3626641

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SIET 2023

SIET 2023: International Conference on Sustainable Information Engineering and Technology

October 24 - 25, 2023

Badung, Bali, Indonesia

Acceptance Rates

Overall Acceptance Rate 45 of 57 submissions, 79%
