Bayesian selective combination of multiple neural networks for improving long-range predictions in nonlinear process modelling

  • Original Article
  • Neural Computing & Applications

Abstract

A Bayesian selective combination method is proposed for combining multiple neural networks in nonlinear dynamic process modelling. Instead of using fixed combination weights, the probability of a particular network being the true model is used as the combination weight for combining that network. The prior probability is calculated using the sum of squared errors of individual networks on a sliding window covering the most recent sampling times. A nearest neighbour method is used for estimating the network error for a given input data point, which is then used in calculating the combination weights for individual networks. Forward selection and backward elimination are used to select the individual networks to be combined. In forward selection, individual networks are gradually added into the aggregated network until the aggregated network error on the original training and testing data sets cannot be further reduced. In backward elimination, all the individual networks are initially aggregated and some of the individual networks are then gradually eliminated until the aggregated network error on the original training and testing data sets cannot be further reduced. Application results demonstrate that the proposed techniques can significantly improve model generalisation and perform better than aggregating all the individual networks.
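
The weighting and forward-selection steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so several things here are assumptions rather than the authors' method: the weights are taken as proportional to exp(-SSE / (2 sigma^2)) over the sliding window (a Gaussian-error form with a hypothetical `sigma` parameter), candidate subsets are scored with a simple average during selection, and the function names `bayesian_weights` and `forward_selection` are made up for illustration. The nearest-neighbour error estimate is not covered.

```python
import numpy as np

def bayesian_weights(window_errors, sigma=1.0):
    """Combination weights from recent errors (assumed Gaussian-error form).

    window_errors: array of shape (n_networks, window_length) holding each
    network's prediction errors over the most recent sampling times.
    """
    sse = np.sum(np.asarray(window_errors, dtype=float) ** 2, axis=1)
    # Shift by the smallest SSE so the exponentials cannot all underflow.
    w = np.exp(-(sse - sse.min()) / (2.0 * sigma ** 2))
    return w / w.sum()

def forward_selection(preds, y):
    """Greedily add networks while the aggregated SSE keeps decreasing.

    preds: array of shape (n_networks, n_samples) of network predictions.
    y:     array of shape (n_samples,) of target values.
    The aggregate used for scoring is a simple average (an assumption).
    """
    preds = np.asarray(preds, dtype=float)
    y = np.asarray(y, dtype=float)
    remaining = list(range(preds.shape[0]))
    selected, best_sse = [], np.inf
    while remaining:
        # SSE of the aggregate if each remaining candidate were added.
        trial = [np.sum((preds[selected + [i]].mean(axis=0) - y) ** 2)
                 for i in remaining]
        k = int(np.argmin(trial))
        if trial[k] >= best_sse:
            break  # no candidate reduces the aggregated error further
        best_sse = trial[k]
        selected.append(remaining.pop(k))
    return selected, best_sse

# Toy data: three "networks" predicting a constant target of 2.
preds = np.array([[1.0, 1.0, 1.0], [3.0, 3.0, 3.0], [10.0, 10.0, 10.0]])
y = np.array([2.0, 2.0, 2.0])
selected, sse = forward_selection(preds, y)
weights = bayesian_weights(preds[selected] - y)
combined = weights @ preds[selected]
```

In this toy case the third network is never added, because including it would increase the aggregated error. Backward elimination would run the same loop in reverse: start from all networks and drop one at a time while the aggregated error keeps falling.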

Acknowledgements

This work was supported by University Science Malaysia (for Z. Ahmad) and UK EPSRC through the Grant GR/R10875 (for J. Zhang). The authors also thank the anonymous reviewers for their constructive comments which helped to improve the quality and presentation of this paper.

Author information

Corresponding author

Correspondence to Jie Zhang.

About this article

Cite this article

Ahmad, Z., Zhang, J. Bayesian selective combination of multiple neural networks for improving long-range predictions in nonlinear process modelling. Neural Comput & Applic 14, 78–87 (2005). https://doi.org/10.1007/s00521-004-0451-y

  • DOI: https://doi.org/10.1007/s00521-004-0451-y
