Abstract
In this study, a blind speech dereverberation method for a noiseless single-input multiple-output (SIMO) acoustic channel is proposed. The method is based on multichannel linear prediction (MCLP) in the short-time Fourier transform (STFT) domain and assumes that both the residual speech signal and the linear prediction coefficients are sparse. The resulting problem is solved by convex optimization using the alternating direction method of multipliers (ADMM) and CVX. The proposed model is compared with state-of-the-art methods that use ℓp-norm optimization criteria. Simulations were carried out in different room models with various reverberation times, numbers of microphones, and parameter settings. The results show that the proposed method outperforms the compared methods in terms of speech dereverberation assessment criteria.
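As a rough illustration of the idea summarized above, the following sketch applies ADMM to a generic doubly sparse linear prediction problem for a single frequency bin. It is a minimal example under stated assumptions, not the authors' formulation: the cost ||x - X g||_1 + lam ||g||_1, the data layout, the function names (soft_threshold, sparse_mclp_admm), and the parameters lam, rho and n_iter are all hypothetical choices made for illustration, and the ℓ1 norm is used as a convex stand-in for the sparsity assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Complex soft-thresholding: shrink each magnitude by t, keep the phase."""
    mag = np.abs(v)
    scale = np.maximum(1.0 - t / np.maximum(mag, 1e-12), 0.0)
    return scale * v

def sparse_mclp_admm(X, x, lam=0.1, rho=1.0, n_iter=300):
    """Approximately minimize ||x - X g||_1 + lam * ||g||_1 over g (one bin).

    X : (frames, taps) complex matrix of delayed multichannel STFT
        observations (assumed layout, hypothetical).
    x : (frames,) complex STFT coefficients of the reference microphone.
    Returns the sparse prediction coefficients and the prediction residual.
    """
    n, k = X.shape
    XH = X.conj().T
    A = XH @ X + np.eye(k)            # fixed system matrix of the g-update
    e = x.astype(complex)             # split variable for the residual x - X g
    z = np.zeros(k, dtype=complex)    # split variable for the coefficients g
    u_e = np.zeros(n, dtype=complex)  # scaled dual for X g + e = x
    u_z = np.zeros(k, dtype=complex)  # scaled dual for g = z
    for _ in range(n_iter):
        # Quadratic subproblem in g (least squares with both constraints).
        g = np.linalg.solve(A, XH @ (x - e - u_e) + (z - u_z))
        r = x - X @ g
        # Proximal steps: l1 on the residual and l1 on the coefficients.
        e = soft_threshold(r - u_e, 1.0 / rho)
        z = soft_threshold(g + u_z, lam / rho)
        # Scaled dual updates.
        u_e = u_e + e - r
        u_z = u_z + g - z
    return z, x - X @ z

# Synthetic usage example (random data, not speech).
rng = np.random.default_rng(0)
n_frames, n_taps = 60, 8
X = rng.standard_normal((n_frames, n_taps)) + 1j * rng.standard_normal((n_frames, n_taps))
g_true = np.zeros(n_taps, dtype=complex)
g_true[[1, 5]] = [0.8, -0.5j]         # sparse prediction coefficients
s = np.zeros(n_frames, dtype=complex)
s[::10] = 3.0                         # sparse "desired" residual
x = X @ g_true + s
g_hat, d_hat = sparse_mclp_admm(X, x)
```

The two soft-thresholding steps are where the sparsity assumptions enter: one acts on the prediction residual and the other on the prediction coefficients. In a full dereverberation system this would be repeated per frequency bin with a prediction delay, which the sketch omits.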
Cite this article
Mousavi, L., Razzazi, F. & Haghbin, A. Blind speech dereverberation using sparse decomposition and multi-channel linear prediction. Int J Speech Technol 22, 729–738 (2019). https://doi.org/10.1007/s10772-019-09620-x
DOI: https://doi.org/10.1007/s10772-019-09620-x