Simultaneous Partitioning of Input and Class Variables for Supervised Classification Problems with Many Classes

Chapter in: Advances in Knowledge Discovery and Management

Part of the book series: Studies in Computational Intelligence (SCI, volume 398)

Abstract

In the data preparation phase of data mining, supervised discretization and value grouping methods have numerous applications: interpretation, conditional density estimation, filter selection of input variables, and variable recoding for classification methods. These methods usually assume a small number of classes, typically fewer than ten, and reach their limits when the number of classes grows too large. In this paper, we extend discretization and value grouping methods by partitioning both the input and class variables. The best joint partition is found by maximizing a Bayesian model selection criterion. We show how to exploit this preprocessing method as a preparation step for the naive Bayes classifier. Extensive experiments demonstrate the benefits of the approach in the case of hundreds of classes.
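The core idea, a joint grid pairing intervals of the input variable with groups of class values, scored by a penalized likelihood, can be sketched as follows. This is a simplified illustration, not the chapter's Bayesian (MODL-style) criterion: the equal-frequency binning, the BIC-style penalty, and all function names are our own hypothetical choices.

```python
# Sketch of joint input/class partitioning (illustration only, not the
# chapter's criterion): discretize a continuous input into intervals,
# group the class values, and score the resulting grid by a
# log-likelihood with a BIC-style penalty on the number of parameters.
import math
from collections import Counter

def equal_freq_cuts(values, n_bins):
    """Equal-frequency cut points for a continuous variable."""
    xs = sorted(values)
    return [xs[len(xs) * k // n_bins] for k in range(1, n_bins)]

def bin_index(x, cuts):
    """Index of the interval containing x."""
    for i, c in enumerate(cuts):
        if x < c:
            return i
    return len(cuts)

def score_joint_partition(xs, ys, cuts, group_of):
    """Penalized log-likelihood of the classes under the joint grid,
    factoring P(y | x) = P(group | interval) * P(y | group)."""
    n = len(xs)
    bins = [bin_index(x, cuts) for x in xs]
    gs = [group_of[y] for y in ys]
    bin_counts = Counter(bins)
    cell_counts = Counter(zip(bins, gs))
    group_counts = Counter(gs)
    class_counts = Counter(ys)
    loglik = 0.0
    for b, g, y in zip(bins, gs, ys):
        loglik += math.log(cell_counts[(b, g)] / bin_counts[b])  # P(group | interval)
        loglik += math.log(class_counts[y] / group_counts[g])    # P(y | group)
    n_groups = len(set(group_of.values()))
    # Free parameters: one multinomial over groups per interval,
    # plus one multinomial over classes per group.
    n_params = (len(cuts) + 1) * (n_groups - 1) + (len(group_of) - n_groups)
    return loglik - 0.5 * n_params * math.log(n)
```

On data where, say, hundreds of classes fall into a few behaviorally distinct groups along the input, an informative joint partition scores higher than the trivial one-cell grid, which is the trade-off the model selection criterion formalizes.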



Author information

Correspondence to Marc Boullé.


Copyright information

© 2012 Springer Berlin Heidelberg

About this chapter

Cite this chapter

Boullé, M. (2012). Simultaneous Partitioning of Input and Class Variables for Supervised Classification Problems with Many Classes. In: Guillet, F., Ritschard, G., Zighed, D. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25838-1_6

  • DOI: https://doi.org/10.1007/978-3-642-25838-1_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25837-4

  • Online ISBN: 978-3-642-25838-1

  • eBook Packages: Engineering (R0)
