Abstract
Over the last few years, many attempts have been made in human sound and speech processing to build speaker identification systems. These systems take different basic approaches, and the accuracy of the final identification result depends on a variety of factors. To extract features of the human voice, several methods in both the time domain and the frequency domain are used. The popular linear predictive coding (LPC) technique is used to parameterize voices, and the resulting parameters are used later in the voiced/unvoiced separation and extraction functions. For comparison, direct classical methods are also used to extract human sound characteristics. Human voices vary so much over time that a single short recorded voice signal can never carry enough information to identify the speaker with close to 100% accuracy.
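As an illustration of what parameterizing a voice frame with LPC can look like, the following is a minimal sketch of the textbook autocorrelation method with the Levinson-Durbin recursion. The abstract does not say which LPC formulation the authors use, so this is an assumption, and the function names `autocorrelation` and `lpc` are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], the biased autocorrelation of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err), where a[0] == 1.0 and the predictor is
    x[n] ~ -sum(a[k] * x[n - k] for k in 1..order).
    err is the remaining prediction-error energy.
    Assumption: autocorrelation method, not necessarily the chapter's.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop early
        # Reflection coefficient for model order i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update coefficients 1..i from the order-(i-1) solution.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


# Synthetic frame: a decaying exponential, which a first-order
# predictor x[n] = 0.9 * x[n - 1] describes exactly.
frame = [0.9 ** n for n in range(200)]
coeffs, residual = lpc(frame, order=2)
```

For this synthetic frame, `coeffs[1]` comes out close to -0.9 and `coeffs[2]` close to zero, which matches the first-order model that generated the signal. On real speech, a frame would typically be windowed first and a higher order chosen; those choices are left open here.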
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hatem, A.S., Adulredhi, M.J., Abdulrahman, A.M., Fadhel, M.A. (2021). Human Speaker Recognition Based Database Method. In: Abraham, A., Piuri, V., Gandhi, N., Siarry, P., Kaklauskas, A., Madureira, A. (eds) Intelligent Systems Design and Applications. ISDA 2020. Advances in Intelligent Systems and Computing, vol 1351. Springer, Cham. https://doi.org/10.1007/978-3-030-71187-0_106
DOI: https://doi.org/10.1007/978-3-030-71187-0_106
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71186-3
Online ISBN: 978-3-030-71187-0
eBook Packages: Intelligent Technologies and Robotics (R0)