
Architecture for development of adaptive on-line prediction models

  • Special Issue - Regular Research Paper

An Erratum to this article was published on 01 February 2013

Abstract

This work presents an architecture for the development of on-line prediction models. The architecture defines a unified modular environment based on three concepts from machine learning: (i) ensemble methods, (ii) local learning, and (iii) meta-learning. These three concepts are organised in a three-layer hierarchy within the architecture. For the actual prediction making, any data-driven predictive method, such as artificial neural networks or support vector machines, can be implemented and plugged in. In addition to the predictive methods, data pre-processing methods can also be implemented as plug-ins. Models developed according to the architecture can be trained and operated in different modes. With regard to training, the architecture supports the building of initial models from a batch of training data; if such data is not available, the models can also be trained in an incremental mode. In a scenario where correct target values are (occasionally) available at run-time, the architecture supports life-long learning by providing several adaptation mechanisms across the three hierarchical levels. In order to demonstrate its practicality, we show how the issues of current soft sensor development and maintenance can be effectively dealt with by using the architecture as a construction plan for the development of adaptive soft sensing algorithms.
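To make the layered design concrete, the following is a minimal sketch of how the three-level hierarchy described above could be wired together. All names here (Predictor, MeanPredictor, Ensemble, MetaLevel, partial_fit, adapt) are illustrative assumptions, not the authors' implementation, and the weighting and scoring rules are simple stand-ins for the adaptation mechanisms the architecture provides.

```python
import math
from abc import ABC, abstractmethod

class Predictor(ABC):
    """Plug-in interface: any data-driven method (neural network,
    support vector machine, ...) can implement it."""
    @abstractmethod
    def fit(self, X, y): ...          # batch training on initial data
    @abstractmethod
    def partial_fit(self, x, y): ...  # incremental training / adaptation
    @abstractmethod
    def predict(self, x): ...

class MeanPredictor(Predictor):
    """Trivial plug-in predicting the running mean of seen targets."""
    def __init__(self):
        self.total, self.count = 0.0, 0
    def fit(self, X, y):
        for target in y:
            self.partial_fit(None, target)
    def partial_fit(self, x, y):
        self.total += y
        self.count += 1
    def predict(self, x):
        return self.total / self.count if self.count else 0.0

class Ensemble:
    """Middle level: weighted combination of local experts."""
    def __init__(self, experts):
        self.experts = experts
        self.weights = [1.0 / len(experts)] * len(experts)
    def predict(self, x):
        return sum(w * e.predict(x)
                   for w, e in zip(self.weights, self.experts))
    def adapt(self, x, y, eta=0.1):
        # When a target arrives at run-time, shift weight towards the
        # experts with the smallest error, then update each expert.
        errors = [abs(e.predict(x) - y) for e in self.experts]
        self.weights = [w * math.exp(-eta * err)
                        for w, err in zip(self.weights, errors)]
        norm = sum(self.weights)
        self.weights = [w / norm for w in self.weights]
        for expert in self.experts:
            expert.partial_fit(x, y)  # life-long learning per expert

class MetaLevel:
    """Top level: tracks the running error of each ensemble and routes
    predictions to the currently best-performing one."""
    def __init__(self, ensembles):
        self.ensembles = ensembles
        self.scores = [0.0] * len(ensembles)
    def predict(self, x):
        best = min(range(len(self.ensembles)),
                   key=lambda i: self.scores[i])
        return self.ensembles[best].predict(x)
    def adapt(self, x, y, decay=0.9):
        for i, ens in enumerate(self.ensembles):
            err = abs(ens.predict(x) - y)
            self.scores[i] = decay * self.scores[i] + (1 - decay) * err
            ens.adapt(x, y)

# Hypothetical usage: batch initialisation, then run-time adaptation.
model = MetaLevel([Ensemble([MeanPredictor(), MeanPredictor()])])
for expert in model.ensembles[0].experts:
    expert.fit(None, [1.0, 2.0, 3.0])  # initial batch of targets
print(model.predict(None))             # -> 2.0
model.adapt(None, 2.5)                 # occasional true target arrives
```

In this reading, a model is first fitted on a batch of historical data where one exists, and otherwise started empty and trained incrementally; whenever a correct target value becomes available at run-time, a single call to the top-level adapt propagates the adaptation through all three levels, mirroring the life-long learning scenario described in the abstract.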



Author information

Correspondence to Petr Kadlec.

Additional information

An erratum to this article can be found online at http://dx.doi.org/10.1007/s12293-013-0106-6.


About this article

Cite this article

Kadlec, P., Gabrys, B. Architecture for development of adaptive on-line prediction models. Memetic Comp. 1, 241–269 (2009). https://doi.org/10.1007/s12293-009-0017-8

