Abstract
We describe an ensemble learning approach that learns accurately from data partitioned according to the arbitrary spatial requirements of a large-scale simulation, where each classifier may be trained only on the data local to its partition. As a result, class statistics can vary from partition to partition, and some classes may be missing from some partitions entirely. To learn from such data, we combine a fast ensemble learning algorithm with Bayesian decision theory to generate an accurate predictive model of the simulation data. Results from a simulation of an impactor bar crushing a storage canister and from region recognition in face images show that regions of interest are successfully identified.
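The abstract does not spell out how the ensemble outputs and Bayesian decision theory are combined, so the following is only a minimal sketch of one generic way to do it, not the authors' method. It assumes each partition-local classifier reports the training labels it associates with a test point, turns them into Laplace-smoothed class probabilities (so a class absent from a partition still gets nonzero probability), averages those estimates, and picks the class with minimum expected cost. The function names, the class labels, the smoothing constant, and the cost values are all hypothetical.

```python
from collections import Counter

def laplace_posterior(labels, classes, alpha=1.0):
    """Smoothed class probabilities from the labels seen in one partition.

    Smoothing over the global class list keeps a class that is missing
    from this partition from being assigned probability zero.
    """
    counts = Counter(labels)
    total = len(labels) + alpha * len(classes)
    return {c: (counts[c] + alpha) / total for c in classes}

def average_posteriors(posteriors, classes):
    """Combine partition-local estimates by simple averaging."""
    n = len(posteriors)
    return {c: sum(p[c] for p in posteriors) / n for c in classes}

def min_expected_cost(posterior, cost):
    """Bayes decision rule: choose the prediction with the lowest expected cost.

    cost[(true_class, predicted_class)] is the penalty for that outcome.
    """
    def risk(pred):
        return sum(posterior[t] * cost[(t, pred)] for t in posterior)
    return min(posterior, key=risk)

# Hypothetical two-class problem; the second partition never saw "salient".
classes = ["unknown", "salient"]
partition_labels = [
    ["salient"] * 6 + ["unknown"] * 4,
    ["unknown"] * 10,
    ["unknown"] * 7 + ["salient"] * 3,
]
posteriors = [laplace_posterior(lbls, classes) for lbls in partition_labels]
avg = average_posteriors(posteriors, classes)

# Assumed costs: missing a salient region is five times worse than a false alarm.
cost = {
    ("unknown", "unknown"): 0.0,
    ("salient", "salient"): 0.0,
    ("unknown", "salient"): 1.0,
    ("salient", "unknown"): 5.0,
}

majority = max(avg, key=avg.get)         # plain highest-probability vote
decision = min_expected_cost(avg, cost)  # cost-sensitive Bayes decision
print(avg, majority, decision)
```

In this toy setting the averaged probability of the rare class is about one third, so a plain highest-probability vote would label the point "unknown", while the cost-sensitive rule labels it "salient" because a missed region is assumed to be more expensive than a false alarm.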
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Banfield, R.E., Hall, L.O., Bowyer, K.W., Kegelmeyer, W.P. (2005). Ensembles of Classifiers from Spatially Disjoint Data. In: Oza, N.C., Polikar, R., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2005. Lecture Notes in Computer Science, vol 3541. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11494683_20
DOI: https://doi.org/10.1007/11494683_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26306-7
Online ISBN: 978-3-540-31578-0
eBook Packages: Computer Science (R0)