Improving Relative Transfer Function Estimates Using Second-Order Cone Programming

  • Conference paper
Latent Variable Analysis and Signal Separation (LVA/ICA 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9237)

Abstract

This paper addresses the estimation of the Relative Transfer Function (RTF) between microphones from noisy recordings. We use an incomplete initial measurement of the RTF, which is known for only several frequency bins. The measurement is completed by finding its sparsest representation in the time domain. We propose to perform this reconstruction by solving a Second-Order Cone Program (SOCP). The free parameters of this formulation represent the distance of the completed RTF from the initial estimate. We select these parameters based on the theoretical performance of the initial estimate. In experiments with real-world data, this approach achieves a significant refinement of the RTF, especially in scenarios with low signal-to-noise ratios.
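
As an illustration only, the reconstruction described in the abstract can be sketched as the following optimization problem. The notation is an assumption made for this sketch and is not taken from the paper: \(\hat{H}(k)\) is the initial RTF estimate at frequency bin \(k\), \(\mathcal{S}\) is the set of bins where it is known, \(\mathbf{h} \in \mathbb{R}^{N}\) is the time-domain representation being sought, \(\mathbf{F}\) is the \(N \times N\) DFT matrix, and \(\varepsilon_k \ge 0\) are the free parameters bounding the distance from the initial estimate. The \(\ell_1\) norm is used here as the usual convex surrogate for sparsity.

\[
\min_{\mathbf{h} \in \mathbb{R}^{N}} \ \|\mathbf{h}\|_1
\quad \text{subject to} \quad
\bigl| (\mathbf{F}\mathbf{h})_k - \hat{H}(k) \bigr| \le \varepsilon_k ,
\qquad k \in \mathcal{S}.
\]

Under these assumptions the problem has the form of an SOCP. Introducing auxiliary variables \(t_n\) with \(|h_n| \le t_n\) makes the objective linear, and each constraint on a complex value is a second-order cone constraint on its real and imaginary parts:

\[
\min_{\mathbf{h},\,\mathbf{t}} \ \sum_{n=1}^{N} t_n
\quad \text{subject to} \quad
|h_n| \le t_n , \ n = 1,\dots,N, \qquad
\left\|
\begin{bmatrix}
\operatorname{Re}\bigl( (\mathbf{F}\mathbf{h})_k - \hat{H}(k) \bigr) \\
\operatorname{Im}\bigl( (\mathbf{F}\mathbf{h})_k - \hat{H}(k) \bigr)
\end{bmatrix}
\right\|_2 \le \varepsilon_k , \ k \in \mathcal{S}.
\]

A larger \(\varepsilon_k\) lets the completed RTF deviate further from the initial estimate at bin \(k\), which is the natural choice where that estimate is expected to be less accurate.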

This work was supported by The Czech Sciences Foundation through Project No. 14-11898S.

Notes

  1. The right channel \(X_\mathrm{R}\) as well as H are typically delayed by a few samples due to possible acausality of \(H_\mathrm{RTF}\). We omit this detail here to keep the notation simple.

  2. The variance of FD under the model is also derived in [2] and could be taken into account. The bias, however, seems to have a larger influence on the overall accuracy of FD; we therefore focus on the bias.

  3. The kurtosis-based selection appears to be efficient when the frequency components of the target signal have a non-Gaussian distribution while those of the noise are Gaussian; see Sect. 5 in [5]. In real-world situations, this is often satisfied when the target signal is speech and the noise is quasi-stationary.

  4. http://sisec.wiki.irisa.fr/.

  5. http://www.freesound.org/.

  6. http://www.eng.biu.ac.il/gannot/downloads/.

References

  1. Koldovský, Z., Málek, J., Tichavský, P., Nesta, F.: Semi-blind noise extraction using partially known position of the target source. IEEE Trans. Audio Speech Lang. Process. 21(10), 2029–2041 (2013)

  2. Shalvi, O., Weinstein, E.: System identification using nonstationary signals. IEEE Trans. Sig. Process. 44(8), 2055–2063 (1996)

  3. Parra, L., Spence, C.: Convolutive blind separation of non-stationary sources. IEEE Trans. Speech Audio Process. 8(3), 320–327 (2000)

  4. Talmon, R., Gannot, S.: Relative transfer function identification on manifolds for supervised GSC beamformers. In: Proceedings of the 21st European Signal Processing Conference (EUSIPCO), Marrakech, Morocco, pp. 1–5, September 2013

  5. Koldovský, Z., Málek, J., Gannot, S.: Spatial source subtraction based on incomplete measurements of relative transfer function. IEEE Trans. Audio Speech Lang. Process. (2015)

  6. Koldovský, Z., Tichavský, P.: Sparse reconstruction of incomplete relative transfer function: discrete and continuous time domain. Accepted for a special session at EUSIPCO 2015, Nice, France, September 2015

  7. Takahashi, Y., Takatani, T., Osako, K., Saruwatari, H., Shikano, K.: Blind spatial subtraction array for speech enhancement in noisy environment. IEEE Trans. Audio Speech Lang. Process. 17(4), 650–664 (2009)

  8. Nesta, F., Matassoni, M., Astudillo, R.F.: A flexible spatial blind source extraction framework for robust speech recognition in noisy environments. In: Proceedings of the 2nd CHiME Workshop on Machine Listening in Multisource Environment, pp. 33–40, June 2013

  9. Domahidi, A., Chu, E., Boyd, S.: ECOS: An SOCP solver for embedded systems. In: Proceedings of European Control Conference, Zurich, pp. 3071–3076, July 2013

  10. Benichoux, A., Simon, L.S.R., Vincent, E., Gribonval, R.: Convex regularizations for the simultaneous recording of room impulse responses. IEEE Trans. Signal Process. 62(8), 1976–1986 (2014)

  11. Hadad, E., Heese, F., Vary, P., Gannot, S.: Multichannel audio database in various acoustic environments. In: International Workshop on Acoustic Signal Enhancement 2014 (IWAENC 2014), Antibes, France, pp. 313–317, September 2014

Author information

Corresponding author

Correspondence to Zbyněk Koldovský.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Koldovský, Z., Málek, J., Tichavský, P. (2015). Improving Relative Transfer Function Estimates Using Second-Order Cone Programming. In: Vincent, E., Yeredor, A., Koldovský, Z., Tichavský, P. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2015. Lecture Notes in Computer Science, vol 9237. Springer, Cham. https://doi.org/10.1007/978-3-319-22482-4_26

  • DOI: https://doi.org/10.1007/978-3-319-22482-4_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22481-7

  • Online ISBN: 978-3-319-22482-4

  • eBook Packages: Computer Science, Computer Science (R0)
