Blind speech dereverberation using sparse decomposition and multi-channel linear prediction

Mousavi, Leila; Razzazi, Farbod; Haghbin, Afrooz

doi:10.1007/s10772-019-09620-x

Published: 15 July 2019

Volume 22, pages 729–738, (2019)

International Journal of Speech Technology

Abstract

In this study, a blind speech dereverberation method for a noiseless single-input multiple-output (SIMO) acoustic channel is proposed. The method is based on multichannel linear prediction (MCLP) in the short-time Fourier transform (STFT) domain and assumes sparsity in both the residual speech and the channel coefficients; that is, both the residual speech signal and the linear prediction coefficients are assumed to be sparse. The resulting problem is solved by convex optimization using the alternating direction method of multipliers (ADMM) and CVX. The proposed model is compared with state-of-the-art methods that use l_p-norm optimization criteria. Simulations were run on different room models with various reverberation times, numbers of microphones, and parameter settings. The results show that the proposed method performs better in terms of speech dereverberation assessment criteria.
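To make the idea in the abstract concrete, here is a minimal sketch of delayed MCLP for a single STFT frequency bin with sparsity imposed on both the prediction residual and the prediction coefficients, solved with scaled-form ADMM. It is an illustration under stated assumptions, not the authors' formulation, which is not given on this page: the l1 objective `||x - X g||_1 + lam * ||g||_1`, the function names, and the parameters `delay`, `order`, `lam`, `rho`, and `iters` are all hypothetical choices made for this example.

```python
import numpy as np

def soft_threshold(v, t):
    """Complex soft thresholding: shrink magnitudes by t, keep phases."""
    mag = np.abs(v)
    scale = np.maximum(mag - t, 0.0) / np.maximum(mag, 1e-12)
    return scale * v

def delayed_matrix(Y, delay, order):
    """Stack delayed STFT frames of all channels for one frequency bin.

    Y has shape (M, N): M microphones, N time frames (assumed layout).
    Column m*order + k holds channel m shifted by delay + k frames.
    """
    M, N = Y.shape
    if delay + order - 1 >= N:
        raise ValueError("delay + order must not exceed the number of frames")
    X = np.zeros((N, M * order), dtype=complex)
    for m in range(M):
        for k in range(order):
            shift = delay + k
            X[shift:, m * order + k] = Y[m, :N - shift]
    return X

def sparse_mclp_admm(x, X, lam=0.01, rho=1.0, iters=300):
    """Minimize ||x - X g||_1 + lam * ||g||_1 (assumed objective).

    Splitting: z = x - X g (residual) and w = g (coefficients), with
    scaled dual variables u1 and u2. Returns g and the residual x - X g,
    which plays the role of the dereverberated signal in this bin.
    """
    N, P = X.shape
    z = np.zeros(N, dtype=complex)
    w = np.zeros(P, dtype=complex)
    u1 = np.zeros(N, dtype=complex)
    u2 = np.zeros(P, dtype=complex)
    XH = X.conj().T
    A = XH @ X + np.eye(P)  # same system matrix in every g-update
    for _ in range(iters):
        # g-update: least squares coupling both constraints
        g = np.linalg.solve(A, XH @ (x - z - u1) + (w - u2))
        Xg = X @ g
        # z-update: sparse residual
        z = soft_threshold(x - Xg - u1, 1.0 / rho)
        # w-update: sparse prediction coefficients
        w = soft_threshold(g + u2, lam / rho)
        # scaled dual updates for X g + z = x and g = w
        u1 = u1 + Xg + z - x
        u2 = u2 + g - w
    return g, x - X @ g
```

In a full system this routine would be applied independently to every frequency bin, with the first microphone as the reference channel `x`. The fixed iteration count stands in for a proper stopping rule based on primal and dual residuals.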

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
Leila Mousavi, Farbod Razzazi & Afrooz Haghbin

Corresponding author

Correspondence to Farbod Razzazi.

About this article

Cite this article

Mousavi, L., Razzazi, F. & Haghbin, A. Blind speech dereverberation using sparse decomposition and multi-channel linear prediction. Int J Speech Technol 22, 729–738 (2019). https://doi.org/10.1007/s10772-019-09620-x

Received: 16 January 2019
Accepted: 16 June 2019
Published: 15 July 2019
Issue Date: September 2019
DOI: https://doi.org/10.1007/s10772-019-09620-x
