Microphone Array Beamforming Approach to Blind Speech Separation

  • Conference paper
Machine Learning for Multimodal Interaction (MLMI 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4892)

Abstract

In this paper, we present a microphone array beamforming approach to blind speech separation. Unlike previous beamforming approaches, our system requires no a priori knowledge of microphone placement or speaker location, making it directly comparable to other blind source separation methods, which require no prior knowledge of recording conditions. Microphone locations are automatically estimated using an assumed noise field model, and speaker locations are estimated using cross-correlation-based methods. The system is evaluated on the data provided for the PASCAL Speech Separation Challenge 2 (SSC2), achieving a word error rate of 58% on the evaluation set.
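
To illustrate what a cross-correlation-based estimate involves, the following is a minimal sketch of time-delay estimation between two microphone channels. The abstract does not say which cross-correlation method the system uses; the PHAT-weighted generalized cross-correlation shown here is a common choice and is an assumption, as are the function name `gcc_phat_delay` and its parameters. Delays of this kind, measured across microphone pairs, are the usual starting point for locating a speaker.

```python
import numpy as np

def gcc_phat_delay(sig, ref, fs, max_delay=None):
    """Estimate the delay (seconds) of `sig` relative to `ref`.

    Illustrative sketch only: uses PHAT-weighted generalized
    cross-correlation, which is an assumed choice, not necessarily
    the method used in the paper.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, whitened so only phase information remains.
    cross = sig_f * np.conj(ref_f)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if requested.
    max_shift = n // 2
    if max_delay is not None:
        max_shift = max(1, min(int(fs * max_delay), max_shift))

    # Reorder so the array covers lags -max_shift .. +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs
```

A positive return value means `sig` lags `ref`. Passing `max_delay` (for example, the largest microphone spacing divided by the speed of sound) keeps the peak search within lags that the array geometry allows.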

Editor information

Andrei Popescu-Belis, Steve Renals, Hervé Bourlard

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Himawan, I., McCowan, I., Lincoln, M. (2008). Microphone Array Beamforming Approach to Blind Speech Separation. In: Popescu-Belis, A., Renals, S., Bourlard, H. (eds) Machine Learning for Multimodal Interaction. MLMI 2007. Lecture Notes in Computer Science, vol 4892. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78155-4_26

  • DOI: https://doi.org/10.1007/978-3-540-78155-4_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78154-7

  • Online ISBN: 978-3-540-78155-4

  • eBook Packages: Computer Science, Computer Science (R0)
