Speech Activity Detection for Deaf People: Evaluation on the Developed Smart Solution Prototype

Berger, Ales; Maly, Filip

doi:10.1007/978-3-030-14132-5_5

Part of the book series: Studies in Computational Intelligence (SCI, volume 830)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

This research takes a relatively new approach: it develops a smart solution that emerged from earlier research activity, initially using Google Glass and available speech detection services. Over the last year, the authors carried out a series of development, testing, and evaluation steps on the prototype in order to decide which service provides better results than third-party speech detection services such as Google Speech API or IBM Watson Speech To Text. This finding should significantly help the authors during data evaluation and testing in the developed smart solution. The basic idea is that the authors have already developed a functional basic solution, a prototype. This solution worked properly and was usable, but it still has some disadvantages to be improved. To achieve the best possible results, the authors have added another element to their solution. This element comes from statistical analysis and is called the Hidden Markov Model, which has been used in speech recognition applications for the last twenty years. A challenging problem in this domain concerns significant data savings, server load, and detection quality, and it opens space for further improvements through follow-up research and testing. The authors examined many articles and scientific sources in order to find the best solution for more efficient speech recognition usable in their developed prototype (and for this article).
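To make the role of a Hidden Markov Model in speech activity detection more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a two-state model (silence and speech), frame energies quantized into two hypothetical symbols ("low" and "high"), and hand-picked probabilities chosen purely for illustration. The most likely state sequence is decoded with the Viterbi algorithm in the log domain.

```python
import math

# All probabilities below are illustrative assumptions, not values from the chapter.
STATES = ("silence", "speech")
START = {"silence": 0.8, "speech": 0.2}
TRANS = {
    "silence": {"silence": 0.9, "speech": 0.1},
    "speech": {"silence": 0.1, "speech": 0.9},
}
EMIT = {
    "silence": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.2, "high": 0.8},
}


def quantize(energies, threshold):
    """Map per-frame energies to the symbols 'low' / 'high' (hypothetical front end)."""
    return ["high" if e > threshold else "low" for e in energies]


def viterbi(observations):
    """Return the most likely silence/speech label for every frame."""
    if not observations:
        return []
    # Log-score of the best path ending in each state after the first frame.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_score = {}
        pointers = {}
        for s in STATES:
            # Best predecessor state for landing in state s at this frame.
            best_prev = max(
                STATES, key=lambda p: score[p] + math.log(TRANS[p][s])
            )
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        backpointers.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With these assumed parameters, a single low-energy frame inside a run of high-energy frames is still labelled as speech, because leaving and re-entering the speech state is less probable than one unlikely emission. That smoothing is the main advantage of an HMM over a plain per-frame energy threshold; on a device, only the frames labelled as speech would then need to be sent to a recognition service.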

Acknowledgements

This work was supported by a project of the Students Grant Agency, FIM, University of Hradec Kralove, Czech Republic. Ales Berger is a student member of the research team.

Author information

Authors and Affiliations

Faculty of Informatics and Management, University of Hradec Kralove, Hradec Kralove, Czech Republic
Ales Berger & Filip Maly

Corresponding author

Correspondence to Ales Berger.

Editor information

Editors and Affiliations

Department of Information Systems, Wrocław University of Science and Technology, Wrocław, Poland
Maciej Huk
Department of Information Systems, Wrocław University of Science and Technology, Wrocław, Poland
Marcin Maleszka
Department of Management, Gdańsk University of Technology, Gdańsk, Poland
Edward Szczerbicki

About this chapter

Cite this chapter

Berger, A., Maly, F. (2020). Speech Activity Detection for Deaf People: Evaluation on the Developed Smart Solution Prototype. In: Huk, M., Maleszka, M., Szczerbicki, E. (eds) Intelligent Information and Database Systems: Recent Developments. ACIIDS 2019. Studies in Computational Intelligence, vol 830. Springer, Cham. https://doi.org/10.1007/978-3-030-14132-5_5

DOI: https://doi.org/10.1007/978-3-030-14132-5_5
Published: 06 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14131-8
Online ISBN: 978-3-030-14132-5
eBook Packages: Intelligent Technologies and Robotics (R0)
