Preprocessing of Dysarthric Speech in Noise Based on CV–Dependent Wiener Filtering

  • Conference paper
Proceedings of the Paralinguistic Information and its Integration in Spoken Dialogue Systems Workshop

Abstract

In this paper, we propose a consonant–vowel (CV) dependent Wiener filter for dysarthric automatic speech recognition (ASR) in noisy environments. When a conventional Wiener filter is applied to dysarthric speech in noise, it distorts the initial consonants. This is because, compared to normal speech, the speech spectrum at a consonant–vowel onset in dysarthric speech is much more similar to that of noise, so speech at the onset is easily removed by the Wiener filtering. To mitigate this problem, the transfer function of the Wiener filter is constructed differently depending on the result of CV classification, which is performed by combining voice activity detection (VAD) and vowel onset estimation. In this work, VAD is done by a statistical model based approach, and vowel onsets are estimated by examining the variation of linear prediction residual signals. To demonstrate the effectiveness of the proposed CV–dependent Wiener filter for dysarthric ASR, we compare the performance of an ASR system employing the proposed method with that of a system using a conventional Wiener filter, for groups with different degrees of disability and under different signal-to-noise ratio conditions. The ASR experiments show that the proposed Wiener filter achieves relative average word error rate reductions of 10.41%, 6.03%, and 0.94% for the mild, moderate, and severe disability groups, respectively, compared to the conventional Wiener filter.
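
The idea of a class-dependent transfer function can be illustrated with a minimal sketch. The abstract does not give the actual transfer functions or the exact rule for combining VAD with vowel onsets, so everything specific below is an assumption made for illustration: the labeling rule in `label_frames`, the per-class gain floors in `GAIN_FLOOR`, the power-subtraction SNR estimate, and all function names. It shows only the general mechanism: frames are labeled as noise, consonant, or vowel, and the Wiener gain is limited less aggressively on consonant frames so that low-energy onsets are not suppressed like noise.

```python
# Illustrative sketch only. Labels, gain floors, and function names are
# assumptions, not the transfer functions or CV classifier of the paper.

# Hypothetical per-class gain floors: a higher floor on consonant frames
# keeps weak, noise-like onsets from being attenuated as if they were noise.
GAIN_FLOOR = {"noise": 0.05, "consonant": 0.5, "vowel": 0.1}


def label_frames(vad_flags, onset_frames):
    """Label each frame as 'noise', 'consonant', or 'vowel'.

    Assumed rule: inside a run of VAD-active frames, frames before the
    first vowel onset are consonant; frames from the onset on are vowel.
    """
    labels = []
    seen_onset = False
    for i, active in enumerate(vad_flags):
        if not active:
            labels.append("noise")
            seen_onset = False  # a new speech run starts from scratch
            continue
        if i in onset_frames:
            seen_onset = True
        labels.append("vowel" if seen_onset else "consonant")
    return labels


def wiener_gain(noisy_power, noise_power, floor):
    """Per-bin Wiener gain H = xi / (1 + xi), limited below by `floor`."""
    noise_power = max(noise_power, 1e-12)  # avoid division by zero
    # Crude a priori SNR estimate by power subtraction, clamped at zero.
    xi = max(noisy_power - noise_power, 0.0) / noise_power
    return max(xi / (1.0 + xi), floor)


def cv_dependent_gains(noisy_power_frames, noise_power, labels):
    """Compute one gain per frequency bin, with a floor chosen per frame."""
    gains = []
    for frame, label in zip(noisy_power_frames, labels):
        floor = GAIN_FLOOR[label]
        gains.append([wiener_gain(p, n, floor)
                      for p, n in zip(frame, noise_power)])
    return gains
```

With these assumed floors, a bin whose noisy power is barely above the noise estimate keeps a gain of 0.5 on a consonant frame, but would drop to below 0.1 if the same frame were labeled noise. That difference is the effect the CV-dependent design aims for.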


Acknowledgements

This work was supported in part by the R&D Program of MKE/KEIT (10036461, Development of an embedded key-word spotting speech recognition system individually customized for disabled persons with dysarthria) and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (No. 2010-0023888).

Author information

Corresponding author

Correspondence to Ji Hun Park.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this paper

Cite this paper

Park, J.H., Seong, W.K., Kim, H.K. (2011). Preprocessing of Dysarthric Speech in Noise Based on CV–Dependent Wiener Filtering. In: Delgado, RC., Kobayashi, T. (eds) Proceedings of the Paralinguistic Information and its Integration in Spoken Dialogue Systems Workshop. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1335-6_6

  • DOI: https://doi.org/10.1007/978-1-4614-1335-6_6

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-1334-9

  • Online ISBN: 978-1-4614-1335-6

  • eBook Packages: Engineering, Engineering (R0)
