Abstract
We investigate the performance of a generalisation error predictor, Gest, in the context of error-correcting output coding (ECOC) ensembles based on multi-layer perceptrons. Each dichotomy associated with a column of the ECOC code matrix is trained on a bootstrap sample of the training set, and Gest uses the out-of-bootstrap samples to efficiently estimate the mean column error on an independent test set, and hence the test error. This estimate can then be used to select a suitable complexity for the base classifiers in the ensemble. An experimental evaluation on benchmark datasets with added classification noise shows that over-fitting can be detected, and a comparison is made with the Q measure of ensemble diversity.
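The out-of-bootstrap estimate described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: a nearest-centroid dichotomiser stands in for the multi-layer perceptron base classifiers, the data and code matrix are synthetic, and all names (`nearest_centroid_error`, `oob_errors`, etc.) are hypothetical. It shows the core idea only: relabel the training set per code-matrix column, train each dichotomiser on a bootstrap resample, and score it on the points the resample missed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-class data: Gaussian clusters standing in for a real training set.
n_per, n_classes = 60, 3
X = np.vstack([rng.normal(loc=3.0 * c, scale=1.0, size=(n_per, 2))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per)
n = len(y)

# Random ECOC code matrix: one row per class, one column per dichotomy.
n_cols = 15
code = rng.integers(0, 2, size=(n_classes, n_cols))

def nearest_centroid_error(Xtr, ytr, Xte, yte):
    """Train a nearest-centroid dichotomiser and return its error on (Xte, yte)."""
    c0 = Xtr[ytr == 0].mean(axis=0)
    c1 = Xtr[ytr == 1].mean(axis=0)
    pred = (np.linalg.norm(Xte - c1, axis=1) <
            np.linalg.norm(Xte - c0, axis=1)).astype(int)
    return np.mean(pred != yte)

oob_errors = []
for j in range(n_cols):
    yj = code[y, j]                          # binary relabelling for column j
    boot = rng.integers(0, n, size=n)        # bootstrap sample (with replacement)
    oob = np.setdiff1d(np.arange(n), boot)   # out-of-bootstrap indices
    if len(oob) == 0 or len(np.unique(yj[boot])) < 2:
        continue                             # skip degenerate resamples/columns
    oob_errors.append(
        nearest_centroid_error(X[boot], yj[boot], X[oob], yj[oob]))

# Average over columns: the out-of-bootstrap estimate of mean column error.
mean_column_error = float(np.mean(oob_errors))
print(f"estimated mean column (dichotomy) error: {mean_column_error:.3f}")
```

In the paper's setting this estimate would be recomputed as base-classifier complexity (e.g. MLP training epochs or hidden nodes) varies, and the complexity minimising the estimate selected, without touching a held-out test set.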
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Prior, M., Windeatt, T. (2005). Over-Fitting in Ensembles of Neural Network Classifiers Within ECOC Frameworks. In: Oza, N.C., Polikar, R., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2005. Lecture Notes in Computer Science, vol 3541. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11494683_29
DOI: https://doi.org/10.1007/11494683_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26306-7
Online ISBN: 978-3-540-31578-0