Abstract
A novel approach for assisting bidirectional communication between people with normal hearing and hearing-impaired people is presented. Whereas existing assistive devices such as hearing aids and cochlear implants are vulnerable to extreme noise conditions or post-surgery side effects, the proposed concept is an alternative approach in which spoken dialogue is achieved by employing a robust speech recognition technique that accounts for noisy environmental factors, without any attachment to the human body. The proposed system is a portable device with an acoustic beamformer for directional noise reduction, capable of performing speech-to-text transcription based on a keyword spotting method. It is also equipped with a user interface optimized for hearing-impaired people, enabling intuitive and natural device usage across diverse domain contexts. The experimental results confirm that the proposed interface design is feasible for realizing an effective and efficient intelligent agent for the hearing-impaired.
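To illustrate the directional noise reduction idea mentioned above, the following is a minimal delay-and-sum beamformer sketch in Python with NumPy. It is an illustrative assumption, not the paper's actual beamformer (which is not specified in the abstract): each microphone channel is time-aligned toward the talker's direction and averaged, so uncorrelated noise partially cancels while the aligned speech adds coherently. The signal values, delays, and noise level below are invented for the demo.

```python
import numpy as np

def delay_and_sum(mic_signals, delays_samples):
    """Align each microphone channel by its integer sample delay and average.

    mic_signals: (n_mics, n_samples) array of synchronized recordings
    delays_samples: per-mic delays (in samples) steering toward the talker
    """
    n_mics, _ = mic_signals.shape
    out = np.zeros(mic_signals.shape[1])
    for sig, d in zip(mic_signals, delays_samples):
        out += np.roll(sig, -d)  # advance the channel by d samples
    return out / n_mics

# Two-mic demo: the same 1 kHz tone arrives 3 samples later at mic 2,
# and each mic adds its own uncorrelated noise.
fs = 16000
t = np.arange(1024) / fs
clean = np.sin(2 * np.pi * 1000 * t)
rng = np.random.default_rng(0)
mics = np.stack([clean + 0.3 * rng.standard_normal(t.size),
                 np.roll(clean, 3) + 0.3 * rng.standard_normal(t.size)])

enhanced = delay_and_sum(mics, delays_samples=[0, 3])

# Averaging aligned channels reduces uncorrelated noise power, so the
# beamformer output is closer to the clean tone than a single microphone.
err_single = np.mean((mics[0] - clean) ** 2)
err_beam = np.mean((enhanced - clean) ** 2)
print(err_beam < err_single)
```

In a real device the steering delays would come from the array geometry and the estimated direction of arrival, typically with fractional-delay filtering in the frequency domain rather than integer sample shifts.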
Acknowledgments
This research was supported by the Ministry of Health & Welfare R&D programme (A111189) and partially supported by a Seokyeong University grant programme in 2013.
Cite this article
Lee, S., Kang, S., Han, D.K. et al. Dialogue enabling speech-to-text user assistive agent system for hearing-impaired person. Med Biol Eng Comput 54, 915–926 (2016). https://doi.org/10.1007/s11517-015-1447-8