Preprocessing of Independent Vector Analysis Using Feed-Forward Network for Robust Speech Recognition

  • Conference paper
Neural Information Processing (ICONIP 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7063)

Abstract

This paper describes an algorithm that applies independent vector analysis (IVA) using a feed-forward network as preprocessing for robust speech recognition. In the framework of IVA, a feed-forward network can be used as a separating system to successfully separate highly reverberated mixtures. For robust speech recognition, we apply cluster-based missing-feature reconstruction to the log-spectral features of the separated speech during the extraction of mel-frequency cepstral coefficients. The algorithm identifies corrupted time-frequency segments as those with low signal-to-noise ratios, calculated from the log-spectral features of the separated speech and the observed noisy speech. The corrupted segments are filled in by bounded estimation based on the features judged reliable and on knowledge of the pre-trained log-spectral feature clusters. Experimental results demonstrate that the proposed method significantly enhances recognition performance in noisy environments.
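
The two missing-feature steps summarized in the abstract (flagging low-SNR time-frequency cells, then filling them by bounded, cluster-based estimation) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the function names `reliability_mask` and `reconstruct_frame`, the 0 dB threshold, the use of natural-log power spectra, the estimate of noise power as observed power minus separated power, diagonal-covariance Gaussian clusters, and the simplified bounded estimate (a posterior-weighted cluster mean capped at the observed value).

```python
import numpy as np

def reliability_mask(log_sep, log_obs, snr_threshold_db=0.0):
    """Return True for time-frequency cells whose estimated local SNR is high.

    log_sep, log_obs: natural-log power spectra of the separated and the
    observed noisy speech, shape (frames, bands). Assumed representation.
    """
    p_sep = np.maximum(np.exp(log_sep), 1e-10)
    # Assumption: noise power is whatever remains after removing the separated speech.
    p_noise = np.maximum(np.exp(log_obs) - p_sep, 1e-10)
    snr_db = 10.0 * np.log10(p_sep / p_noise)
    return snr_db > snr_threshold_db

def reconstruct_frame(obs, mask, means, variances, priors):
    """Fill unreliable bands of one log-spectral frame.

    obs: (bands,) observed log-spectral features.
    mask: (bands,) bool, True where the feature is considered reliable.
    means, variances: (clusters, bands) pre-trained diagonal Gaussian clusters.
    priors: (clusters,) cluster weights.
    """
    # Cluster posterior computed from the reliable bands only.
    diff = obs[mask] - means[:, mask]
    var = variances[:, mask]
    log_lik = -0.5 * np.sum(diff ** 2 / var + np.log(2.0 * np.pi * var), axis=1)
    log_post = np.log(priors) + log_lik
    log_post -= log_post.max()  # numerical stability before exponentiating
    post = np.exp(log_post)
    post /= post.sum()

    # Posterior-weighted cluster mean as the estimate for every band.
    estimate = post @ means

    # Bounded estimation (simplified): the clean log-spectrum should not
    # exceed the noisy observation, so cap the estimate at the observed value.
    out = obs.copy()
    out[~mask] = np.minimum(estimate[~mask], obs[~mask])
    return out
```

In use, `reliability_mask` would be applied to the whole utterance and `reconstruct_frame` to each frame in turn, with the reconstructed log-spectral features then passed on to the remaining cepstral-coefficient computation. The full method in the missing-feature literature uses a bounded MAP estimate; the cap used here is a deliberately simpler stand-in.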

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oh, M., Park, HM. (2011). Preprocessing of Independent Vector Analysis Using Feed-Forward Network for Robust Speech Recognition. In: Lu, BL., Zhang, L., Kwok, J. (eds) Neural Information Processing. ICONIP 2011. Lecture Notes in Computer Science, vol 7063. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24958-7_43

  • DOI: https://doi.org/10.1007/978-3-642-24958-7_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24957-0

  • Online ISBN: 978-3-642-24958-7

  • eBook Packages: Computer Science, Computer Science (R0)
