Real-Time Lossy Audio Signal Reconstruction Using Novel Sliding Based Multi-instance Linear Regression/Random Forest and Enhanced CGPANN

Khan, Nadia Masood; Khan, Gul Muhammad

doi:10.1007/s11063-020-10379-5

Published: 17 November 2020

Volume 53, pages 227–255 (2021)

Neural Processing Letters

Abstract

This paper proposes a novel neuroevolutionary algorithm, the Enhanced Cartesian Genetic Programming evolved Artificial Neural Network (ECGPANN), as a real-time predictor of lost signal samples. Unlike the traditional Cartesian Genetic Programming evolved Artificial Neural Network (CGPANN), the proposed algorithm uses a bi-chromosomal architecture instead of a single chromosome, so that the topology evolves in parallel with the weights and architecture. This modification makes it suitable for obtaining global optimum solutions when predicting both periodic and aperiodic lost samples at run time. Sliding Window based Multi-instance Linear Regression (SW-MLR) and Sliding Window based Multi-instance Random Forest (SW-MRF) prediction algorithms are also used to reconstruct multiple missing samples. Because SW-MLR and SW-MRF are trained on a fixed number of inputs and outputs, they cannot handle random signal loss, where the number of output estimates needed varies at run time. ECGPANN, by contrast, can produce a variable number of outputs in real time. Experimental results demonstrate the efficacy of ECGPANN for both single-sample and multi-sample loss with fixed periodic and aperiodic noise using the sliding window technique. The SNR improvement ranges from 20 to 37 dB for periodic noise and from 31 to 44 dB for aperiodic noise, for signals with 16.6–50% of samples missing. Compared with the traditional CGPANN, ECGPANN improves prediction accuracy by 4–5% on average. The proposed ECGPANN model achieves a mean absolute error (MAE) of 0.051 (speech), 0.015 (guitar) and 0.038 (flute) when 16.6% of the signal is lost or corrupted, and an MAE of 0.066 (speech), 0.020 (guitar) and 0.049 (flute) when 50% is lost or corrupted.
The networks are trained and tested on speech signals and evaluated on music signals to assess generality. ECGPANN performs consistently better regardless of signal type and, in contrast to SW-MLR and SW-MRF, remains robust as the number of missing samples changes. Its ability to predict a randomly varying number of missing samples makes it applicable in real time.
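
To make the sliding-window idea and the two reported metrics concrete, the following is a minimal sketch of a fixed-input, fixed-output linear predictor filling periodic gaps, scored with MAE and SNR improvement. It is not the authors' SW-MLR implementation. The synthetic two-tone signal, the loss pattern (the last sample of every six, which gives 16.6% loss), the zero-filled baseline, and all function names (`make_windows`, `fit_linear`, `reconstruct`, `mae`, `snr_db`) are assumptions chosen for illustration.

```python
import numpy as np

def make_windows(signal, n_in, n_out):
    """Slide over `signal`: each row pairs n_in samples with the n_out that follow."""
    X, Y = [], []
    for s in range(len(signal) - n_in - n_out + 1):
        X.append(signal[s:s + n_in])
        Y.append(signal[s + n_in:s + n_in + n_out])
    return np.asarray(X), np.asarray(Y)

def fit_linear(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W, one weight column per output sample."""
    A = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W  # shape (n_in + 1, n_out)

def reconstruct(damaged, lost, W, n_in, n_out):
    """Fill each gap from the n_in samples before it.

    Assumption: every gap is exactly n_out samples long and is preceded
    by at least n_in samples (the fixed input/output constraint).
    """
    out = damaged.copy()
    i = n_in
    while i + n_out <= len(out):
        if lost[i]:
            context = np.append(out[i - n_in:i], 1.0)  # window plus bias term
            out[i:i + n_out] = context @ W
            i += n_out
        else:
            i += 1
    return out

def mae(clean, estimate):
    """Mean absolute error over all samples."""
    return float(np.mean(np.abs(clean - estimate)))

def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB, treating (clean - estimate) as noise."""
    noise = np.sum((clean - estimate) ** 2)
    return float(10 * np.log10(np.sum(clean ** 2) / max(noise, 1e-20)))

# Hypothetical test signal: two sinusoids plus a little seeded noise.
rng = np.random.default_rng(0)
t = np.arange(1200)
clean = (0.6 * np.sin(2 * np.pi * t / 50)
         + 0.3 * np.sin(2 * np.pi * t / 17)
         + 0.01 * rng.standard_normal(t.size))
train, test = clean[:600], clean[600:]

n_in, n_out = 5, 1
W = fit_linear(*make_windows(train, n_in, n_out))

# Periodic loss: the last n_out samples of every (n_in + n_out) are missing.
lost = (np.arange(test.size) % (n_in + n_out)) >= n_in
damaged = np.where(lost, 0.0, test)  # zero-filled baseline
restored = reconstruct(damaged, lost, W, n_in, n_out)

improvement = snr_db(test, restored) - snr_db(test, damaged)
print(f"lost fraction:   {lost.mean():.3f}")
print(f"MAE zero-filled: {mae(test, damaged):.4f}")
print(f"MAE restored:    {mae(test, restored):.4f}")
print(f"SNR improvement: {improvement:.1f} dB")
```

Because `W` has a fixed shape of `(n_in + 1, n_out)`, this predictor can only fill gaps of exactly `n_out` samples. That is the limitation the abstract attributes to SW-MLR and SW-MRF under random loss, where the gap length is not known in advance.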

Author information

Authors and Affiliations

Electrical Engineering Department, University of Engineering and Technology (UET) Peshawar, Peshawar, Pakistan
Nadia Masood Khan
National Center of Artificial Intelligence (NCAI), Electrical Engineering Department, University of Engineering and Technology (UET) Peshawar, Peshawar, Pakistan
Gul Muhammad Khan

Corresponding author

Correspondence to Nadia Masood Khan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Khan, N.M., Khan, G.M. Real-Time Lossy Audio Signal Reconstruction Using Novel Sliding Based Multi-instance Linear Regression/Random Forest and Enhanced CGPANN. Neural Process Lett 53, 227–255 (2021). https://doi.org/10.1007/s11063-020-10379-5

Accepted: 17 October 2020
Published: 17 November 2020
Issue Date: February 2021
DOI: https://doi.org/10.1007/s11063-020-10379-5
