Endpoint Detection and De-noising Method Based on Multi-resolution Spectrogram

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9772)

Abstract

This paper studies endpoint detection for noisy speech. Because speech and noise look visibly different in a spectrogram, the paper adopts a spectrogram-based endpoint detection method. The technical difficulty of such methods is expressing this visual difference as a mathematical quantity. Since autocorrelation coefficients describe texture features well, the paper uses the autocorrelation function to characterize the difference and proposes a column autocorrelation spectrogram detection method. The distribution of the spectrogram autocorrelation function provides the threshold for endpoint detection, which locates the cut-off point between speech and noise in the noisy signal. Because the paper uses a wideband spectrogram, which has poor frequency resolution, residual noise remains in the speech columns after autocorrelation spectrogram detection. To remove noise further in different frequency bands, the paper exploits the multi-resolution property of empirical mode decomposition (EMD): the noisy speech is decomposed into different frequency scales, and each scale is analyzed again with the column autocorrelation spectrogram. Experiments showed that the method reduced the noise in noisy speech effectively.
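
The column autocorrelation idea can be illustrated with a minimal Python sketch. It is not the paper's implementation, and every name in it is hypothetical. It assumes the magnitude spectrogram is already available as a list of columns (one column per time frame, values ordered by frequency), scores each column by its normalized lag-1 autocorrelation along the frequency axis, and marks columns above a threshold as speech. The abstract says the threshold comes from the distribution of the autocorrelation values but does not give the rule, so the midpoint between the smallest and largest score is used here purely as a stand-in.

```python
from typing import List, Optional, Tuple


def column_autocorr(column: List[float], lag: int = 1) -> float:
    """Normalized autocorrelation of one spectrogram column along frequency.

    The column is mean-centered first, so a smooth, textured column scores
    high and a rapidly fluctuating column scores low or negative.
    """
    n = len(column)
    mean = sum(column) / n
    centered = [v - mean for v in column]
    energy = sum(c * c for c in centered)
    if energy == 0.0:
        return 0.0  # flat column carries no texture information
    cross = sum(centered[k] * centered[k + lag] for k in range(n - lag))
    return cross / energy


def detect_endpoints(
    spectrogram: List[List[float]],
    threshold: Optional[float] = None,
    lag: int = 1,
) -> Optional[Tuple[int, int]]:
    """Return (first, last) column indices classified as speech, or None."""
    scores = [column_autocorr(col, lag) for col in spectrogram]
    if threshold is None:
        # Assumption: midpoint of the score range stands in for the
        # distribution-based threshold mentioned in the abstract.
        threshold = (min(scores) + max(scores)) / 2.0
    speech = [i for i, s in enumerate(scores) if s > threshold]
    if not speech:
        return None
    return speech[0], speech[-1]


# Toy data: noise-like columns fluctuate from bin to bin,
# speech-like columns vary smoothly across frequency.
noise_col = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
speech_col = [0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0]
toy_spectrogram = [noise_col] * 3 + [speech_col] * 4 + [noise_col] * 2
endpoints = detect_endpoints(toy_spectrogram)
```

In practice the columns would come from a wideband short-time Fourier transform of the noisy signal, and under the scheme the abstract describes the same scoring would be repeated on each EMD component. Both steps are left out of this sketch.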

This work is supported by the Guangdong Provincial Science and Technology Project (No. 2013B040401015).

Author information

Corresponding author

Correspondence to Jing Zhang.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, J. (2016). Endpoint Detection and De-noising Method Based on Multi-resolution Spectrogram. In: Huang, DS., Jo, KH. (eds) Intelligent Computing Theories and Application. ICIC 2016. Lecture Notes in Computer Science, vol 9772. Springer, Cham. https://doi.org/10.1007/978-3-319-42294-7_34

  • DOI: https://doi.org/10.1007/978-3-319-42294-7_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42293-0

  • Online ISBN: 978-3-319-42294-7

  • eBook Packages: Computer Science, Computer Science (R0)
