A Classification-Based Non-local Means Adaptive Filtering for Speech Enhancement and Its FPGA Prototype

Srinivas, Nagapuri; Pradhan, Gayadhar; Kumar, Puli Kishore

doi:10.1007/s00034-019-01267-y

Published: 25 September 2019

Volume 39, pages 2489–2506, (2020)

Circuits, Systems, and Signal Processing

Nagapuri Srinivas ORCID: orcid.org/0000-0003-0131-5517¹,
Gayadhar Pradhan¹ &
Puli Kishore Kumar²

Abstract

Non-local means (NLM) adaptive filtering is a well-explored technique for denoising images and electrocardiogram signals. In NLM filtering, the signal value at a given sample point is estimated as a weighted average of the sample points in a search neighborhood. The NLM filter removes noise effectively when the samples of the signal within the search neighborhood are similar to one another. Because the vocal-tract system and the excitation source are time-varying, the magnitude and frequency content of a speech signal vary over time. Consequently, NLM filtering applied directly is not effective at removing noise from speech. The similarity among the sample points can be improved by classifying the speech signal into categories according to its magnitude and frequency components. In a given speech signal, vowel-like speech (VLS) regions have higher magnitude than non-VLS regions. Vowel, semivowel and diphthong sound units are collectively termed VLS. In this work, the noisy speech signal is first classified into VLS and non-VLS to improve similarity. Next, the non-local similarity present within the VLS and within the non-VLS is exploited separately for effective speech enhancement through NLM filtering. The experimental results presented in this study show that the proposed approach provides better denoising performance than NLM filtering without speech classification and than recently reported speech enhancement methods. The hardware architecture of the proposed approach is also designed and prototyped on an FPGA.
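The weighted-average estimate described in the abstract, and the idea of restricting it to samples of the same class, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (nlm_1d, energy_labels), the patch and search sizes, the bandwidth h, and the synthetic test signal are all assumptions made for illustration. In particular, energy_labels is a hypothetical stand-in that thresholds short-time energy; the abstract does not say how VLS regions are detected.

```python
import numpy as np


def nlm_1d(x, half_patch=5, half_search=50, h=0.3, labels=None):
    """Non-local means estimate of a 1-D signal (illustrative parameters).

    Each sample is replaced by a weighted average of the samples in its
    search neighborhood. Weights decay with the mean squared difference
    between the patches centred on the two samples. If `labels` is given,
    only samples that share the label of the current sample are averaged.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if labels is not None:
        labels = np.asarray(labels)
    padded = np.pad(x, half_patch, mode="reflect")
    # Row i of `patches` is the patch centred on sample i.
    patches = np.lib.stride_tricks.sliding_window_view(
        padded, 2 * half_patch + 1
    )
    out = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - half_search), min(n, i + half_search + 1)
        idx = np.arange(lo, hi)
        if labels is not None:
            # Keep same-class samples only; sample i itself always remains.
            idx = idx[labels[idx] == labels[i]]
        d2 = np.mean((patches[idx] - patches[i]) ** 2, axis=1)
        w = np.exp(-d2 / (h * h))
        out[i] = np.dot(w, x[idx]) / np.sum(w)
    return out


def energy_labels(x, frame=200, ratio=0.5):
    """Hypothetical stand-in classifier (assumption, not from the paper).

    Returns 1 where the short-time energy exceeds `ratio` times its mean
    (treated here as "VLS-like") and 0 elsewhere.
    """
    x = np.asarray(x, dtype=float)
    energy = np.convolve(x ** 2, np.ones(frame) / frame, mode="same")
    return (energy > ratio * energy.mean()).astype(int)


# Synthetic stand-in for speech: a loud low-frequency segment followed by
# a quiet high-frequency segment, corrupted by white Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(1600)
clean = np.where(
    t < 800, np.sin(2 * np.pi * t / 80), 0.1 * np.sin(2 * np.pi * t / 10)
)
noisy = clean + 0.1 * rng.standard_normal(t.size)

labels = energy_labels(noisy)
plain = nlm_1d(noisy)                      # NLM without classification
classified = nlm_1d(noisy, labels=labels)  # NLM within each class

for name, est in [("noisy", noisy), ("plain NLM", plain),
                  ("classified NLM", classified)]:
    print(name, "MSE:", np.mean((est - clean) ** 2))
```

The sketch only shows the mechanics; whether the class-restricted variant helps on a given signal depends on the classifier and on the parameter choices, which would need tuning on real speech.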

Acknowledgements

This research work is a sub-module of the project “Development of Speech Based Person Authentication System on FPGA” under the SMDP-C2SD (9(I)/2014-MDD) program and is supported by the Ministry of Electronics and Information Technology (MeitY), Government of India.

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, National Institute of Technology Patna, Patna, India
Nagapuri Srinivas & Gayadhar Pradhan
Department of Electronics and Communication Engineering, National Institute of Technology, Andhra Pradesh, Tadepalligudem, India
Puli Kishore Kumar

Corresponding author

Correspondence to Nagapuri Srinivas.

About this article

Cite this article

Srinivas, N., Pradhan, G. & Kumar, P.K. A Classification-Based Non-local Means Adaptive Filtering for Speech Enhancement and Its FPGA Prototype. Circuits Syst Signal Process 39, 2489–2506 (2020). https://doi.org/10.1007/s00034-019-01267-y

Received: 01 December 2018
Revised: 15 September 2019
Accepted: 16 September 2019
Published: 25 September 2019
Issue Date: May 2020
DOI: https://doi.org/10.1007/s00034-019-01267-y
