A review of adaptive online learning for artificial neural networks

  • Published in: Artificial Intelligence Review

Abstract

In real applications, learning algorithms must address several issues: huge amounts of data, samples that arrive continuously, and underlying data-generation processes that evolve over time. Classical learning is not always appropriate in these environments, since it assumes independent and identically distributed data. Given the requirements of the learning process, systems should be able to modify both their structures and their parameters. In this survey, our aim is to review the methodologies developed for adaptive learning with artificial neural networks, analyzing the strategies that have traditionally been applied over the years. We focus on sequential learning, the handling of the concept drift problem, and the determination of the network structure. Despite the research in this field, there are currently no standard methods for dealing with these environments, and several issues remain open problems.
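The two themes the abstract highlights, sequential (sample-by-sample) learning and concept drift handling, can be illustrated with a minimal sketch. All names, thresholds, and the reset-on-drift strategy below are illustrative assumptions, not methods from the paper: a single linear neuron is updated one sample at a time, and a naive sliding-window error monitor discards the model when the recent error rate suggests the underlying distribution has changed.

```python
def train_online(stream, lr=0.1, window=50, drift_threshold=0.3):
    """Sequential learning sketch (illustrative, not from the survey):
    a single linear neuron updated one sample at a time, with a
    sliding-window error monitor that reinitializes the weights when
    the recent error rate suggests concept drift."""
    w, b = [0.0, 0.0], 0.0
    recent_errors = []
    for x, y in stream:  # samples arrive continuously, one at a time
        pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
        recent_errors.append(int(pred != y))
        if len(recent_errors) > window:
            recent_errors.pop(0)
        # perceptron-style parameter update on each new sample
        delta = lr * (y - pred)
        w[0] += delta * x[0]
        w[1] += delta * x[1]
        b += delta
        # crude drift handling: forget the model when recent errors pile up
        if len(recent_errors) == window and sum(recent_errors) / window > drift_threshold:
            w, b, recent_errors = [0.0, 0.0], 0.0, []
    return w, b
```

Real adaptive methods reviewed in the survey replace both pieces with something principled (e.g. statistical drift detectors and structural adaptation rather than a blind reset), but the loop shape, update-per-sample plus a monitor of recent performance, is the common skeleton.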


References

  • Alippi C, Roveri M (2008) Just-in-time adaptive classifiers—part II: designing the classifier. IEEE Trans Neural Netw 19(12):2053–2064

  • Alippi C, Boracchi G, Roveri M (2011) A just-in-time adaptive classification system based on the intersection of confidence intervals rule. Neural Netw 24(8):791–800

  • Alippi C, Boracchi G, Roveri M (2012) Just-in-time ensemble of classifiers. In: Proceedings of international joint conference on neural networks (IJCNN’12), pp 1–8

  • Alippi C, Boracchi G, Roveri M (2013) Just-in-time classifiers for recurrent concepts. IEEE Trans Neural Netw Learn Syst 24(4):620–634

  • Augasta MG, Kathirvalavakumar T (2011) A novel pruning algorithm for optimizing feedforward neural network of classification problems. Neural Process Lett 34:241–258

  • Augasta MG, Kathirvalavakumar T (2013) Pruning algorithms of neural networks—a comparative study. Cent Eur J Comp Sci 3(3):105–115

  • Bauer F, Lukas MA (2011) Comparing parameter choice methods for regularization of ill-posed problems. Math Comput Simul 81:1795–1841

  • Baum EB, Haussler D (1989) What size net gives valid generalization? Neural Comput 1:151–160

  • Beale EM (1972) A derivation of conjugate gradients, numerical methods for nonlinear optimization. Academic Press, New York

  • Bertini Junior JR, Nicoletti MC (2016) Enhancing constructive neural networks performance using functionally expanded input data. J Artif Intell Soft Comput Res 6(2):119–131

  • Bifet A, Gavalda R (2006) Kalman filters and adaptive windows for learning in data streams. In: Proceedings of international conference discovery science, pp 29–40

  • Bifet A, Gavalda R (2007) Learning from time-changing data with adaptive windowing. In: Proceedings of SIAM international conference on data mining (SDM 2007)

  • Bishop CM (1995) Neural networks for pattern recognition. Oxford University Press, Oxford

  • Bondarenko A, Borisov A, Aleksejeva L (2015) Neurons vs weights pruning in artificial neural networks. In: Proceedings of the 10th international scientific and practical conference, vol III, pp 22–28

  • Bottou L (2004) Stochastic learning. Adv Lect Mach Learn Lect Notes Artif Intell 3176:146–168

  • Bouchachia A (2011) Incremental learning with multi-level adaptation. Neurocomputing 74(11):1785–1799

  • Bouchachia A, Gabrys B, Sahel Z (2007) Overview of some incremental learning algorithms. In: Proceedings of the IEEE international conference on fuzzy systems, pp 1–6

  • Brzezinski D, Stefanowski J (2014) Reacting to different types of concept drift: the accuracy updated ensemble algorithm. IEEE Trans Neural Netw Learn Syst 25(1):81–94

  • Camargo LS, Yoneyama T (2001) Specification of training sets and the number of hidden neurons for multilayer perceptrons. Neural Comput 13(12):2673–2680

  • Chentouf R, Jutten C (1996) DWINA: depth and width incremental neural algorithm. In: Proceedings of the IEEE international conference on neural networks, pp 153–158

  • LeCun Y, Denker JS, Solla SA (1990) Optimal brain damage. Adv Neural Inf Process Syst 2:598–605

  • de Jesus Rubio J, Perez-Cruz H (2014) Evolving intelligent system for the modelling of nonlinear systems with dead-zone input. Appl Soft Comput 14(Part B):289–304

  • Ditzler G, Rosen G, Polikar R (2013) Discounted expert weighting for concept drift. In: IEEE symposium on computational intelligence in dynamic and uncertain environments (CIDUE’13), pp 61–67

  • Ditzler G, Rosen G, Polikar R (2014) Domain adaptation bounds for multiple expert systems under concept drift. In: International joint conference on neural networks (IJCNN’14), pp 595–601

  • Ditzler G, Roveri M, Alippi C, Polikar R (2015) Learning in nonstationary environments: a survey. IEEE Comput Intell Mag 10(4):12–25

  • Egrioglu E, Aladag CH, Gunay S (2008) A new model selection strategy in artificial neural networks. Appl Math Comput 195:591–597

  • Elwell R, Polikar R (2011) Incremental learning of concept drift in nonstationary environments. IEEE Trans Neural Netw 22(10):1517–1531

  • Engel Y, Mannor S, Meir R (2004) The kernel recursive least-squares algorithm. IEEE Trans Signal Process 52(8):2275–2285

  • Esposito F, Ferilli S, Fanizzi N, Basile T, Mauro MD (2004) Incremental learning and concept drift in INTHELEX. Intell Data Anal 8(3):213–237

  • Fan Q, Zurada JM, Wu W (2014) Convergence of online gradient method for feedforward neural networks with smoothing \(l_{1/2}\) regularization penalty. Neural Netw 50:72–78

  • Fritzke B (1994) Growing cell structures—a self-organizing network for unsupervised and supervised learning. Neural Netw 7(9):1441–1460

  • Gama J (2010) Knowledge discovery from data streams. Chapman and Hall/CRC, Boca Raton

  • Gama J, Medas P, Castillo G, Rodrigues P (2004) Learning with drift detection. In: Proceedings of advances in artificial intelligence (SBIA 2004), pp 286–295

  • Gama J, Sebastiao R, Pereira Rodrigues P (2013) On evaluating stream learning algorithms. Mach Learn 90(3):317–346

  • Gama J, Žliobaitė I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv 46(4):44:1–44:37

  • García-Pedrajas N, Ortiz-Boyer D (2007) A cooperative constructive method for neural networks for pattern recognition. Pattern Recognit 40(1):80–98

  • Ghazikhani A, Monsefi R, Sadoghi Yazdi H (2014) Online neural network model for non-stationary and imbalanced data stream classification. Int J Mach Learn Cybernet 5(1):51–62

  • Goodwin GC, Sin KS (1984) Adaptive filtering, prediction and control. Prentice-Hall, Englewood Cliffs

  • Gregorcic G, Lightbody G (2007) Local model network identification with Gaussian processes. IEEE Trans Neural Netw 18:1404–1423

  • Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11(1):23–63

  • Hagan MT, Menhaj M (1994) Training feedforward networks with the Marquardt algorithm. IEEE Trans Neural Netw 5(6):989–993

  • Han H-G, Qiao J-F (2013) A structure optimisation algorithm for feedforward neural network construction. Neurocomputing 99:347–357

  • Hassibi B, Stork DG (1993) Second-order derivatives for network pruning: optimal brain surgeon. Adv Neural Inf Process Syst 5:164–171

  • Haykin S (1999) Neural networks: a comprehensive foundation. Prentice Hall, New Jersey

  • He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284

  • Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366

  • Hsu CF (2008) Adaptive growing-and-pruning neural network control for a linear piezoelectric ceramic motor. Eng Appl Artif Intell 21(8):1153–1163

  • Huang G-B, Chen L (2008) Enhanced random search based incremental extreme learning machine. Neurocomputing 71(16–18):3460–3468

  • Huang DS, Du JX (2008) A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks. IEEE Trans Neural Netw 19(12):2099–2115

  • Huang GB, Saratchandran P, Sundararajan N (2005) Generalised growing and pruning RBF (GGAP-RBF) neural network for function approximation. IEEE Trans Neural Netw 16(1):57–67

  • Huang G-B, Zhu Q-Y, Siew C-K (2006) Extreme learning machine: theory and applications. Neurocomputing 70:489–501

  • Islam MM, Sattar MA, Amin MF, Yao X, Murase K (2009) A new adaptive merging and growing algorithm for designing artificial neural networks. IEEE Trans Syst Man Cybern 39(3):705–722

  • Jain LC, Seera M, Lim CP, Balasubramaniam P (2014) A review of online learning in supervised neural networks. Neural Comput Appl 25:491–509

  • Klinkenberg R (2004) Learning drifting concepts: example selection vs. example weighting. Intell Data Anal 8(3):281–300

  • Krempl G, Žliobaitė I, Brzeziński D, Hüllermeier E, Last M, Lemaire V, Noack T, Shaker A, Sievi S, Spiliopoulou M, Stefanowski J (2014) Open challenges for data stream mining research. SIGKDD Explor 16(1):1–10

  • Kubat M, Gama J, Utgoff P (2004) Incremental learning and concept drift, editor’s introduction: guest-editorial. Intell Data Anal 8(3):211–212

  • Kuncheva L, Žliobaitė I (2009) On the window size for classification in changing environments. Intell Data Anal 13(6):861–872

  • Kwok T-Y, Yeung D-Y (1997) Constructive algorithms for structure learning in feedforward neural networks for regression problems. IEEE Trans Neural Netw 8(3):630–645

  • Lauret P, Fock E, Mara TA (2006) A node pruning algorithm based on a Fourier amplitude sensitivity test method. IEEE Trans Neural Netw 17(2):273–293

  • LeCun Y, Bottou L, Orr G, Müller K-R (1998) Efficient backprop. Neural Netw Tricks Trade 1524:9–50

  • Levenberg K (1944) A method for the solution of certain non-linear problems in least squares. Q J Appl Math 2(2):164–168

  • Liang N-Y, Huang G-B (2006) A fast and accurate online sequential learning algorithm for feedforward networks. IEEE Trans Neural Netw 17(6):1411–1423

  • Liu Y, Starzyk A, Zhu Z (2007) Optimizing number of hidden neurons in neural networks. In: Proceedings of the artificial intelligence and applications (AIAP’07), pp 121–126

  • Liu W, Pokharel PP, Principe JC (2008a) The kernel least-mean-square algorithm. IEEE Trans Signal Process 56(2):543–554

  • Liu Y, Starzyk A, Zhu Z (2008b) Optimized approximation algorithm in neural networks without overfitting. IEEE Trans Neural Netw 19(6):983–995

  • Liu W, Park I, Principe JC (2009) Extended kernel recursive least squares algorithm. IEEE Trans Signal Process 57(10):3801–3814

  • Ma L, Khorasani K (2003) A new strategy for adaptively constructing multilayer feedforward neural networks. Neurocomputing 51:361–385

  • Marquardt DW (1963) An algorithm for least-squares estimation of non-linear parameters. J Soc Ind Appl Math 11(2):431–441

  • Marques Silva A, Caminhas W, Lemos A, Gomide F (2014) A fast learning algorithm for evolving neo-fuzzy neuron. Appl Soft Comput 14(B):194–209

  • Martínez-Rego D, Pérez-Sánchez B, Fontenla-Romero O, Alonso-Betanzos A (2011) A robust incremental learning method for non-stationary environments. Neurocomputing 74:1800–1808

  • Minku LL, White AP, Yao X (2010) The impact of diversity on on-line ensemble learning in the presence of concept drift. IEEE Trans Knowl Data Eng 22:730–742

  • Minku L, Yao X (2012) Ddd: a new ensemble approach for dealing with concept drift. IEEE Trans Knowl Data Eng 24(4):619–633

  • Moller M (1993) Supervised learning on large redundant training sets. Int J Neural Syst 4(1):15–25

  • Nagumo J, Noda A (1967) A learning method for system identification. IEEE Trans Autom Control 12:283–287

  • Narasimha PL, Delashmit WH, Manry MT, Li J, Maldonado F (2008) An integrated growing-pruning method for feedforward network training. Neurocomputing 71(13–15):2831–2847

  • Ortega-Zamorano F, Jerez J, Urda D, Luque-Baena R, Franco L (2014) FPGA implementation of the C-MANTEC neural network constructive algorithm. IEEE Trans Ind Inf 10(2):1154–1161

  • Ortega-Zamorano F, Jerez J, Jurez G, Franco L (2015) FPGA implementation comparison between C-MANTEC and back-propagation. In: International workshop on artificial neural networks (IWANN 2015), vol Part II of LNCS, pp 197–208

  • Peng H, Mou L, Li G, Chen Y, Lu Y, Jin Z (2015) A comparative study on regularization strategies for embedding-based neural networks. In: Proceedings of the conference on empirical methods in natural language processing (EMNLP 2015), pp 2106–2111

  • Pérez-Sánchez B, Fontenla-Romero O, Guijarro-Berdiñas B, Martínez-Rego D (2013) An online learning algorithm for adaptable topologies of neural networks. Expert Syst Appl 40:7294–7304

  • Pérez-Sánchez B, Fontenla-Romero O, Guijarro-Berdiñas B (2014) Self-adaptive topology neural network for online incremental learning. In: Proceedings of the international conference on agents and artificial intelligence (ICAART’14), pp 94–101

  • Pérez-Sánchez B, Fontenla-Romero O, Guijarro-Berdiñas B (2015) Adaptive neural topology based on Vapnik–Chervonenkis dimension. In: Lecture Notes in Artificial Intelligence (in press)

  • Pavlidis NG, Tasoulis DK, Adams NM, Hand DJ (2011) λ-perceptron: an adaptive classifier for data streams. Pattern Recogn 44(1):78–96

  • Qiao JF, Han HG (2010) A repair algorithm for RBF neural network and its application to chemical oxygen demand modeling. Int J Neural Syst 20(1):63–74

  • Qiao J, Zhang Z, Bo Y (2014) An online self-adaptive modular neural network for time-varying systems. Neurocomputing 125:7–16

  • Qiao J, Li F, Han H, Li W (2016) Constructive algorithm for fully connected cascade feedforward neural networks. Neurocomputing 182:154–164

  • Qi M, Zhang GP (2001) An investigation of model selection criteria for neural network time series forecasting. Eur J Oper Res 132:666–680

  • Reitermanová Z (2008) Feedforward neural networks architecture optimization and knowledge extraction. In: Proceedings of week of doctoral students (WDS 2008), vol Part I, pp 159–164

  • Robins A (2004) Sequential learning in neural networks: a review and a discussion of pseudorehearsal based methods. Intell Data Anal 8(3):301–322

  • Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65(6):386–408

  • Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536

  • Scarselli F, Tsoi AC (1998) Universal approximation using feedforward neural networks: a survey of some existing methods and some new results. Neural Netw 11(1):15–37

  • Shao HM, Zheng GF (2011) Boundedness and convergence of online gradient method with penalty and momentum. Neurocomputing 74:765–770

  • Sharma SK, Chandra P (2010) Constructive neural networks: a review. Int J Eng Sci Technol 2(12):7847–7855

  • Subirats JL, Franco L, Jerez JM (2012) C-MANTEC: a novel constructive neural network algorithm incorporating competition between neurons. Neural Netw 26:131–140

  • Teoh EJ, Tan KC, Xiang C (2006) Estimating the number of hidden neurons in a feedforward network using the singular value decomposition. IEEE Trans Neural Netw 17(6):1623–1629

  • Thomas P, Suhner MC (2015) A new multilayer perceptron pruning algorithm for classification and regression applications. Neural Process Lett 42(2):437–458

  • Vapnik V (1998) Statistical learning theory. Wiley, New York

  • Wang C, Hill DJ (2006) Learning from neural control. IEEE Trans Neural Netw 17(1):30–46

  • Wang J, Yang G, Liu S, Zurada JM (2015a) Convergence analysis of multilayer feedforward networks trained with penalty terms: a review. J Appl Comput Sci Methods 7(2):89–103

  • Wang J-H, Wang H-Y, Chen Y-L, Liu C-M (2015b) A constructive algorithm for unsupervised learning with incremental neural network. J Appl Res Technol 13:188–196

  • Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Mach Learn 23:69–101

  • Widrow B, Hoff ME (1960) Adaptive switching circuits. In: Proceedings of IRE WESCON convention, pp 96–104

  • Wu W, Fan QW, Zurada JM, Wang J, Yang DK, Liu Y (2014) Batch gradient method with smoothing \(l_{1/2}\) regularization for training of feedforward neural networks. Neural Netw 50:72–78

  • Xu J, Ho DWC (2006) A new training and pruning algorithm based on node dependence and jacobian rank deficiency. Neurocomputing 70(1–3):544–558

  • Yamakawa T, Uchino E, Miki T, Kusanagi H (1992) A neo-fuzzy neuron and its applications to system identification and prediction of the system behavior. Proc Int Conf Fuzzy Logic Neural Netw 1:477–484

  • Ye Y, Squartini S, Piazza F (2013) Online sequential extreme learning machine in nonstationary environments. Neurocomputing 116:94–101

  • Yoan M, Sorjamaa A, Bas P, Simula O, Jutten C, Lendasse A (2010) OP-ELM: optimally pruned extreme learning machine. IEEE Trans Neural Netw 21(1):158–162

  • Yu X, Chen QF (2012) Convergence of gradient method with penalty for ridge polynomial neural network. Neurocomputing 97:405–409

  • Zeng W, Wang C (2015) Classification of neurodegenerative diseases using gait dynamics via deterministic learning. Inf Sci 317(C):246–258

  • Zeng W, Wang C, Yang F (2014) Silhouette-based gait recognition via deterministic learning. Pattern Recogn 47(11):3568–3584

  • Zeng W, Wang Q, Liu F, Wang Y (2016) Learning from adaptive neural network output feedback control of a unicycle-type mobile robot. ISA Trans 61:337–347

  • Zhang HS, Wu W, Liu F, Yao MC (2009) Boundedness and convergence of online gradient method with penalty for feedforward neural networks. IEEE Trans Neural Netw 20(6):1050–1054

  • Zhang R, Lan Y, Huang GB, Xu ZB (2012) Universal approximation of extreme learning machine with adaptive growth of hidden nodes. IEEE Trans Neural Netw Learn Syst 23(2):365–371

Acknowledgments

The authors would like to acknowledge support for this work from the Xunta de Galicia (Grant GRC2014/035) and the Secretaría de Estado de Investigación of the Spanish Government (Grant TIN2015-65069), both partially supported by European Union ERDF funds.

Author information

Corresponding author

Correspondence to Beatriz Pérez-Sánchez.

About this article

Cite this article

Pérez-Sánchez, B., Fontenla-Romero, O. & Guijarro-Berdiñas, B. A review of adaptive online learning for artificial neural networks. Artif Intell Rev 49, 281–299 (2018). https://doi.org/10.1007/s10462-016-9526-2
