Complex Extension of Infinite Sparse Factor Analysis for Blind Speech Separation

  • Conference paper
Latent Variable Analysis and Signal Separation (LVA/ICA 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7191)

Abstract

We present a blind source separation (BSS) method for speech signals that extends infinite sparse factor analysis (ISFA) to complex values in the frequency domain. The method is robust against the delayed signals that commonly occur in real environments, such as reflections, short-time reverberation, and differences in arrival time between microphones. Conventional ISFA is a non-parametric Bayesian BSS method that has been applied only to time-domain signals because it can handle only real-valued signals. Our method instead uses complex normal distributions to estimate the source signals and the mixing matrix. Experimental results indicate that our method outperforms conventional ISFA in average signal-to-distortion ratio (SDR).
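The link between delayed signals and complex values in the abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes a single source, a circular (wrap-around) delay, and a plain DFT rather than the short-time transform a real system would use, and the helper names `dft`, `circular_delay`, and `mixing_coefficients` are hypothetical. Under those assumptions, a delayed and attenuated copy of a signal equals the original spectrum multiplied by one complex coefficient per frequency bin, which is why a frequency-domain mixing matrix needs complex entries.

```python
import cmath
import random


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def circular_delay(x, delay, gain=1.0):
    """Scale x by gain and delay it circularly by `delay` samples (assumption)."""
    n = len(x)
    return [gain * x[(t - delay) % n] for t in range(n)]


def mixing_coefficients(n, delay, gain=1.0):
    """Per-bin complex coefficient that a delay and gain produce in the DFT domain."""
    return [gain * cmath.exp(-2j * cmath.pi * k * delay / n) for k in range(n)]


# Illustrative values only; none of these come from the paper.
random.seed(0)
source = [random.gauss(0.0, 1.0) for _ in range(32)]
delay, gain = 3, 0.7

# Spectrum of the delayed signal, computed directly.
observed = dft(circular_delay(source, delay, gain))

# Same spectrum, predicted as (complex coefficient) x (source spectrum).
coefficients = mixing_coefficients(len(source), delay, gain)
predicted = [a * s for a, s in zip(coefficients, dft(source))]

max_error = max(abs(o - p) for o, p in zip(observed, predicted))
print(f"largest difference between the two spectra: {max_error:.2e}")
```

The two spectra agree up to floating-point rounding. With real short-time transforms the relation holds only approximately, and only while the delay is short compared with the analysis window.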

Editor information

Fabian Theis, Andrzej Cichocki, Arie Yeredor, Michael Zibulevsky

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nagira, K., Takahashi, T., Ogata, T., Okuno, H.G. (2012). Complex Extension of Infinite Sparse Factor Analysis for Blind Speech Separation. In: Theis, F., Cichocki, A., Yeredor, A., Zibulevsky, M. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2012. Lecture Notes in Computer Science, vol 7191. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28551-6_48

  • DOI: https://doi.org/10.1007/978-3-642-28551-6_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28550-9

  • Online ISBN: 978-3-642-28551-6

  • eBook Packages: Computer Science, Computer Science (R0)
