Abstract
In this paper we present a perspective on the overall process of developing classifiers for real-world classification problems. Specifically, we identify, categorize and discuss the various problem-specific factors that influence the development process. Illustrative examples demonstrate the iterative nature of applying classification algorithms in practice. In addition, we present a case study of a large-scale classification application using the process framework described, providing an end-to-end example of that iteration. The paper concludes that developing classification applications for operational use involves many factors not normally considered in typical discussions of classification models and algorithms.
Cite this article
Brodley, C.E., Smyth, P. Applying classification algorithms in practice. Statistics and Computing 7, 45–56 (1997). https://doi.org/10.1023/A:1018557312521
DOI: https://doi.org/10.1023/A:1018557312521