Dialogue enabling speech-to-text user assistive agent system for hearing-impaired person

  • Original Article
Medical & Biological Engineering & Computing

Abstract

A novel approach for assisting bidirectional communication between people with normal hearing and people who are hearing-impaired is presented. Existing assistive devices such as hearing aids and cochlear implants are vulnerable to extreme noise conditions or post-surgery side effects. The proposed concept is an alternative in which spoken dialogue is achieved by employing a robust speech recognition technique that takes noisy environmental factors into account and requires no attachment to the human body. The proposed system is a portable device with an acoustic beamformer for directional noise reduction; it performs speech-to-text transcription using a keyword spotting method. It is also equipped with a user interface optimized for hearing-impaired people, which makes device usage intuitive and natural across diverse domain contexts. The experimental results confirm that the proposed interface design is feasible for realizing an effective and efficient intelligent agent for hearing-impaired people.
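
As an illustration of the directional noise reduction mentioned in the abstract, the following is a minimal delay-and-sum beamformer sketch in Python. The abstract does not state which beamforming algorithm the device uses, so delay-and-sum is an assumption chosen for simplicity. The function names `shift` and `delay_and_sum`, the integer sample delays, and the synthetic sine signals are all hypothetical.

```python
import math


def shift(signal, d):
    """Delay a signal by d samples, zero-padding the start."""
    return [0.0] * d + signal[:len(signal) - d]


def delay_and_sum(channels, delays):
    """Align each microphone channel by its sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   per-channel arrival delay in samples (assumed known from the
              array geometry and the look direction; hypothetical here).
    """
    n = len(channels[0])
    out = [0.0] * n
    for samples, d in zip(channels, delays):
        for t in range(n):
            src = t + d  # advance the channel to undo its arrival delay
            if 0 <= src < n:
                out[t] += samples[src]
    return [v / len(channels) for v in out]


# Synthetic example (assumed values, not taken from the paper).
n = 64
target = [math.sin(2 * math.pi * t / 16) for t in range(n)]
noise = [math.sin(2 * math.pi * t / 6) for t in range(n)]
steering = [0, 2, 4]  # hypothetical per-microphone delays toward the talker

# The target reaches each microphone with its steering delay; the interferer
# comes from another direction, modeled as zero delay on every channel.
channels = [[s + v for s, v in zip(shift(target, d), noise)] for d in steering]
enhanced = delay_and_sum(channels, steering)
```

In this synthetic case the interferer's period (6 samples) is chosen so that the three steered copies cancel exactly, while the aligned copies of the target add coherently. Real noise would only be attenuated, not removed.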

Acknowledgments

This research was supported by the Ministry of Health & Welfare R&D (A111189) and partially supported by a Seokyeong University grant programme in 2013.

Author information

Corresponding author

Correspondence to Sunmee Kang.

Cite this article

Lee, S., Kang, S., Han, D.K. et al. Dialogue enabling speech-to-text user assistive agent system for hearing-impaired person. Med Biol Eng Comput 54, 915–926 (2016). https://doi.org/10.1007/s11517-015-1447-8

  • DOI: https://doi.org/10.1007/s11517-015-1447-8
