Speech Activity Detection for Deaf People: Evaluation on the Developed Smart Solution Prototype

Intelligent Information and Database Systems: Recent Developments (ACIIDS 2019)

Part of the book series: Studies in Computational Intelligence ((SCI,volume 830))

Abstract

This research presents a relatively new approach: a smart solution that grew out of the authors' earlier work with Google Glass and available speech detection services. Over the past year, the authors carried out a series of development, testing, and evaluation steps on the prototype in order to decide which service provides better results than third-party speech detection services such as Google Speech API or IBM Watson Speech To Text. This finding should significantly help the authors with data evaluation and testing in the developed smart solution. The authors had already developed a functional basic solution, a prototype. It worked properly and was usable, but some disadvantages remained to be addressed. To achieve the best possible results, the authors added another element to their solution: the Hidden Markov Model, a statistical method that has been used in speech recognition applications for the last twenty years. A challenging problem in this domain concerns data savings, server load, and detection quality, which leaves room for further improvements through follow-up research and testing. The authors studied many articles and scientific sources to find the best way to raise the efficiency of speech recognition in their developed prototype (and for this article).
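To illustrate how a Hidden Markov Model can be applied to speech activity detection in general, the following is a minimal sketch and not the authors' implementation. It assumes a two-state model (silence, speech), frame energies already quantized into the symbols "low" and "high", and hand-picked probabilities; the state names, symbols, and all numeric values are hypothetical and chosen only for illustration. Decoding uses the standard Viterbi algorithm.

```python
import math

STATES = ("silence", "speech")

# All probabilities below are illustrative assumptions, not values from the chapter.
START = {"silence": 0.8, "speech": 0.2}
TRANS = {
    "silence": {"silence": 0.9, "speech": 0.1},
    "speech": {"silence": 0.1, "speech": 0.9},
}
# Observations are per-frame energies quantized into "low" / "high" (an assumption).
EMIT = {
    "silence": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.2, "high": 0.8},
}


def viterbi(observations):
    """Return the most likely state sequence for a list of 'low'/'high' symbols."""
    if not observations:
        return []
    # Work with log-probabilities to avoid underflow on long sequences.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]]) for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s at this frame.
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    frames = ["low", "low", "high", "high", "high", "low", "low"]
    for symbol, state in zip(frames, viterbi(frames)):
        print(symbol, "->", state)
```

With these assumed parameters, a sustained run of high-energy frames is labeled as speech, while a single isolated high-energy frame surrounded by low-energy frames stays labeled as silence, because the transition probabilities penalize rapid state switching. This smoothing effect is one reason an HMM can reduce spurious detections before audio is sent to a recognition service.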

Acknowledgements

This work and the contribution were supported by the project of Students Grant Agency—FIM, University of Hradec Kralove, Czech Republic. Ales Berger is a student member of the research team.

Corresponding author

Correspondence to Ales Berger.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Berger, A., Maly, F. (2020). Speech Activity Detection for Deaf People: Evaluation on the Developed Smart Solution Prototype. In: Huk, M., Maleszka, M., Szczerbicki, E. (eds) Intelligent Information and Database Systems: Recent Developments. ACIIDS 2019. Studies in Computational Intelligence, vol 830. Springer, Cham. https://doi.org/10.1007/978-3-030-14132-5_5
