Real time implementation of voice based robust person authentication using T-F features and CNN

Revathi, A.; Sasikaladevi, N.; Raju, N.

doi:10.1007/s11042-023-16811-x

Published: 18 September 2023

Volume 83, pages 31587–31601, (2024)

Multimedia Tools and Applications

Abstract

Forensic investigations use personal traits to identify the persons involved in criminal offences. In this work on person authentication, recorded voice samples are used to narrow down the search for those persons. Time-frequency (T-F) features obtained from the concatenated training utterances are given to convolutional neural networks (CNNs), whose layers are configured to create templates. Testing utterances are tied, and T-F features are derived from them. These features are applied to the CNN templates, and recognition accuracy is computed from the claimed matches to validate the feature selection and the CNN technique. Decision-level fusion of the features, with CNNs used for modelling and classification, provides an overall authentication rate of 98%. The system is also implemented in real time on Raspberry Pi hardware. This automated system would be helpful for identifying convicts in the forensic sector and for securing online transactions against fraudulent attacks in the financial sector.
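
The pipeline summarised above can be illustrated with a minimal sketch. The abstract does not state which T-F features or which fusion rule the authors use, so the log-magnitude spectrogram, the score-averaging rule, and the names `spectrogram` and `fuse_decisions` below are assumptions chosen for illustration. The CNN itself is omitted; hypothetical per-speaker score vectors stand in for its outputs.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT, one simple time-frequency (T-F) representation.

    Assumption: the paper's actual T-F features may differ.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Rows are frequency bins, columns are time frames.
    return np.log(magnitude + 1e-10).T


def fuse_decisions(score_sets):
    """Decision-level fusion by averaging per-model class scores.

    Assumption: averaging is one common fusion rule, not necessarily
    the rule used in the paper.
    """
    mean_scores = np.mean(np.stack(score_sets), axis=0)
    return int(np.argmax(mean_scores)), mean_scores


# A 1 kHz test tone sampled at 8 kHz stands in for a voice recording.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = spectrogram(tone)  # shape (129, 61) for these settings

# Hypothetical scores for three enrolled speakers from three models,
# each model trained on a different T-F feature.
scores_a = np.array([0.6, 0.3, 0.1])
scores_b = np.array([0.2, 0.5, 0.3])
scores_c = np.array([0.1, 0.7, 0.2])
speaker, fused = fuse_decisions([scores_a, scores_b, scores_c])
```

In this toy case the first model alone would pick speaker 0, while the fused scores favour speaker 1, which shows how fusion can override a single model's decision.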

Data availability

All relevant data are within the paper and its supporting information files.

Acknowledgements

The authors wish to express their sincere thanks to SASTRA Deemed University, Thanjavur, India, for extending infrastructural support to carry out this work.

Author information

Authors and Affiliations

Department of ECE/SEEE, SASTRA Deemed University, Thanjavur, India
A. Revathi & N. Raju
Department of CSE/SEEE, SASTRA Deemed University, Thanjavur, India
N. Sasikaladevi

Corresponding author

Correspondence to A. Revathi.

Ethics declarations

Ethical approval

This article does not contain any studies performed with human participants or animals.

Conflict of interest

The authors have no relevant conflicts of interest to disclose.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

As the authors of the manuscript, we do not have a direct financial relationship with the commercial identity mentioned in our paper that might lead to a conflict of interest for any of the authors.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Revathi, A., Sasikaladevi, N. & Raju, N. Real time implementation of voice based robust person authentication using T-F features and CNN. Multimed Tools Appl 83, 31587–31601 (2024). https://doi.org/10.1007/s11042-023-16811-x

Received: 07 April 2022
Revised: 05 June 2023
Accepted: 31 August 2023
Published: 18 September 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s11042-023-16811-x
