On Improving the Accuracy and Robustness of Time Delay Estimation of Broadband Signals

Circuits, Systems, and Signal Processing

Abstract

The time delay between signals received at two spatially separated microphones is estimated from the location of the peak in the cross-correlation sequence of the two signals. The estimated delay is usually expressed as an integer multiple of the sampling interval. The robustness of the delay estimate is affected by degradation of the signals due to waveform distortion, noise, reflections, and reverberation. The broadband nature of speech signals is exploited to derive multiple pieces of evidence for the time delay, one from each frequency component of the signal. This helps to reduce the effects of waveform distortion, as the cross-correlation is computed on individual components, and the multiple evidence provides robustness in the estimation of the time delay. The accuracy of the estimated time delay depends on the error between the integer delay and the true fractional delay. Methods are proposed to estimate the fractional part of the delay, that is, the part that is a fraction of the sampling interval. The robustness of the proposed method for estimating the fractional part of the delay is examined for different types and levels of additive noise, as well as for reverberation. Experimental results are given for data collected in an actual live room.
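The baseline idea in the abstract, an integer delay read off the cross-correlation peak followed by a sub-sample refinement, can be illustrated with a minimal Python sketch. This is not the method proposed in the article: the fractional part here is obtained by parabolic interpolation around the peak, a common generic refinement chosen only for illustration. The function names `estimate_delay` and `fractional_shift`, the Gaussian test pulse, and the 16 kHz sampling rate are all assumptions.

```python
import numpy as np


def fractional_shift(x, delay_samples):
    """Circularly delay x by a (possibly fractional) number of samples.

    Illustrative helper (an assumption, not from the article): the delay
    is applied as a linear phase shift in the frequency domain.
    """
    n = len(x)
    freqs = np.fft.rfftfreq(n)  # frequency in cycles per sample
    spectrum = np.fft.rfft(x)
    return np.fft.irfft(spectrum * np.exp(-2j * np.pi * freqs * delay_samples), n)


def estimate_delay(x, y, fs):
    """Estimate the delay of y relative to x, in seconds.

    Integer part: lag of the cross-correlation peak.
    Fractional part: parabolic interpolation through the peak and its
    two neighbours (generic refinement, assumed for illustration).
    """
    n = len(x) + len(y) - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two >= n
    cc = np.fft.irfft(np.fft.rfft(y, nfft) * np.conj(np.fft.rfft(x, nfft)), nfft)
    # Reorder so that lags run from -(len(x) - 1) to len(y) - 1.
    cc = np.concatenate((cc[-(len(x) - 1):], cc[:len(y)]))
    lags = np.arange(-(len(x) - 1), len(y))

    k = int(np.argmax(cc))
    frac = 0.0
    if 0 < k < len(cc) - 1:
        a, b, c = cc[k - 1], cc[k], cc[k + 1]
        denom = a - 2.0 * b + c
        if denom != 0.0:
            frac = 0.5 * (a - c) / denom  # vertex of the fitted parabola
    return (lags[k] + frac) / fs


# Usage: a smooth pulse and a copy delayed by 5.3 samples (assumed values).
fs = 16000.0
pulse = np.exp(-0.5 * ((np.arange(256) - 100) / 8.0) ** 2)
delayed = fractional_shift(pulse, 5.3)
print("estimated delay in samples:", estimate_delay(pulse, delayed, fs) * fs)
```

Parabolic interpolation is known to be biased for signals that are not smooth around the correlation peak, which is one reason more careful fractional-delay estimators are of interest.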

Notes

  1. Given the coordinates of the source and microphones, the ground truth for the time delay is computed taking the velocity of sound in air as 330 m/s.

  2. This is done using the VariableFractionalDelay class in MATLAB, which creates a new signal that is delayed with respect to the original signal by the input fractional delay.

  3. Fractional delay refers to a real number.
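The ground-truth computation described in note 1 can be sketched as follows. The speed of sound (330 m/s) is taken from the note; the function name `true_delay`, the example coordinates, the sampling rate, and the choice to round to the nearest sample when splitting the delay are assumptions made only for illustration.

```python
import math

SPEED_OF_SOUND = 330.0  # m/s, the value stated in note 1


def true_delay(source, mic1, mic2, fs, c=SPEED_OF_SOUND):
    """Ground-truth delay of mic2 relative to mic1 for a point source.

    Returns (delay in seconds, nearest integer delay in samples,
    fractional remainder in samples). Rounding to the nearest sample
    is an assumption of this sketch.
    """
    d1 = math.dist(source, mic1)  # source-to-microphone distances in metres
    d2 = math.dist(source, mic2)
    tau = (d2 - d1) / c           # path-length difference over speed of sound
    samples = tau * fs            # delay as a real number of samples
    integer = round(samples)
    return tau, integer, samples - integer


# Usage with hypothetical coordinates (metres) and sampling rate (Hz).
tau, n_int, n_frac = true_delay((0.0, 0.0, 0.0), (3.3, 0.0, 0.0), (6.6, 0.0, 0.0), 16025.0)
print(tau, n_int, n_frac)
```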

Acknowledgements

B. Yegnanarayana would like to thank the Indian National Science Academy for their support.

Funding

None.

Author information

Corresponding author

Correspondence to J. V. Satyanarayana.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Narayanamurthy, B.H.V.S., Satyanarayana, J.V. & Yegnanarayana, B. On Improving the Accuracy and Robustness of Time Delay Estimation of Broadband Signals. Circuits Syst Signal Process 41, 514–531 (2022). https://doi.org/10.1007/s00034-021-01794-7

  • DOI: https://doi.org/10.1007/s00034-021-01794-7