Method of Speech Signal Structuring and Transforming for Biometric Personality Identification

Fedorov, Eugene; Utkina, Tetyana; Nechyporenko, Olga; Korpan, Yaroslav

doi:10.1007/978-3-030-61656-4_20

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1158)

Included in the following conference series:

International Conference on Data Stream Mining and Processing

Abstract

This paper proposes a method for structuring and transforming a speech signal. The method comprises a segmentation procedure, a procedure for determining the fundamental tone of a vocal segment and, on that basis, the boundaries of its quasiperiodic oscillations, and a geometric transformation of those oscillations. The proposed segmentation uses statistical estimation of short-term energies, which permits an adaptive threshold and thereby increases the accuracy with which vocal segments are determined. The proposed determination of the fundamental tone uses bandpass filtering and statistical estimation of local extrema; this reduces computational complexity and noise dependency and permits an adaptive threshold, which increases the accuracy of determining the fundamental tone and the boundaries of the quasiperiodic oscillations of the vocal segment. The proposed geometric transformation maps the quasiperiodic oscillations of the vocal segment onto a single amplitude-time window, so that patterns of the vocal segment can be formed with its structure taken into account. A method for determining the structure of the model that transforms speech signal patterns is also proposed; it is based on a statistical evaluation of the quality of the transformation and provides a high degree of compression and identification of the speech signal.
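The following is a minimal Python sketch of two of the ideas named in the abstract, not the authors' algorithm. The abstract does not give the statistics or formulas used, so everything here is an assumption made for illustration: the function names, the threshold rule (mean plus k standard deviations of the frame energies), the non-overlapping framing, and the use of linear interpolation with peak normalization to bring one oscillation into a fixed amplitude-time window.

```python
import math
from statistics import mean, pstdev


def short_term_energies(signal, frame_len):
    """Return the energy of each non-overlapping frame of frame_len samples."""
    return [
        sum(x * x for x in signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def vocal_frames(signal, frame_len, k=0.5):
    """Flag frames whose energy exceeds an adaptive threshold.

    Assumption: threshold = mean + k * std of the frame energies.
    The statistic used in the paper is not stated in the abstract.
    """
    energies = short_term_energies(signal, frame_len)
    threshold = mean(energies) + k * pstdev(energies)
    return [e > threshold for e in energies]


def to_unit_window(oscillation, n_points=8):
    """Map one oscillation onto a fixed window of n_points samples.

    Assumption: time is rescaled by linear interpolation and amplitude
    by dividing by the peak absolute value, so the result lies in [-1, 1].
    Requires n_points >= 2 and a non-empty oscillation.
    """
    peak = max(abs(x) for x in oscillation) or 1.0
    last = len(oscillation) - 1
    out = []
    for j in range(n_points):
        pos = j * last / (n_points - 1)   # fractional index into the input
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        value = (1 - frac) * oscillation[lo] + frac * oscillation[hi]
        out.append(value / peak)
    return out


# Synthetic example: quiet, then a sine burst, then quiet again.
quiet = [0.01] * 100
burst = [math.sin(2 * math.pi * n / 25) for n in range(100)]
flags = vocal_frames(quiet + burst + quiet, frame_len=50)
window = to_unit_window([0, 2, 0, -2, 0], n_points=5)
```

In this synthetic example only the two frames covering the sine burst are flagged, and the five-sample oscillation is rescaled to peak amplitude 1. Because every oscillation ends up with the same length and amplitude range, oscillations of different pitch and loudness can be compared point by point, which is the stated purpose of the single amplitude-time window.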

Author information

Authors and Affiliations

Cherkasy State Technological University, Cherkasy, Ukraine
Eugene Fedorov, Tetyana Utkina, Olga Nechyporenko & Yaroslav Korpan

Corresponding author

Correspondence to Eugene Fedorov.

Editor information

Editors and Affiliations

Department of Informatics, Univerzita Jana Evangelisty Purkyně v Ústí nad Labem, Ústí nad Labem, Czech Republic
Sergii Babichev
GeoGuard, Kharkiv, Ukraine
Dmytro Peleshko
Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
Olena Vynokurova

Fedorov, E., Utkina, T., Nechyporenko, O., Korpan, Y. (2020). Method of Speech Signal Structuring and Transforming for Biometric Personality Identification. In: Babichev, S., Peleshko, D., Vynokurova, O. (eds) Data Stream Mining & Processing. DSMP 2020. Communications in Computer and Information Science, vol 1158. Springer, Cham. https://doi.org/10.1007/978-3-030-61656-4_20

DOI: https://doi.org/10.1007/978-3-030-61656-4_20
Published: 05 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61655-7
Online ISBN: 978-3-030-61656-4
eBook Packages: Computer Science; Computer Science (R0)
