
On the Complexity of Optimal Multisplitting

Conference paper

Foundations of Intelligent Systems (ISMIS 2000)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1932)

Abstract

Dynamic programming has been studied extensively, e.g., in computational geometry and string matching. It has recently found a new application in the optimal multisplitting of numerical attribute value domains. We relate the earlier results to this problem and study whether they shed new light on the inherent complexity of this time-critical subtask of machine learning and data mining programs. The concept of monotonicity has come up in earlier research; it helps to explain the different asymptotic time requirements of optimal multisplitting under different attribute evaluation functions. As case studies we examine the Training Set Error and Average Class Entropy functions. The former has a linear-time optimization algorithm, while the latter, like most well-known attribute evaluation functions, takes quadratic time to optimize. We show that neither function fulfills the strict monotonicity condition, but that computing optimal Training Set Error values can be decomposed into monotone subproblems.
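
To make the dynamic-programming connection concrete, the sketch below shows the standard quadratic-time recurrence for optimizing Average Class Entropy over partitions into k intervals: the best cost of covering the first j blocks with m intervals is the minimum, over every earlier cut point i, of the best (m-1)-interval cost up to i plus the size-weighted impurity of the interval (i, j]. This is a minimal illustrative sketch, not the paper's own formulation; the names (optimal_multisplit, class_entropy) and the prefix-count representation of the data are assumptions made here for the example.

```python
from math import log2

def class_entropy(counts):
    """Entropy (in bits) of a class-frequency vector."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0) if n else 0.0

def optimal_multisplit(prefix_counts, k):
    """Best cost of splitting the data into exactly k intervals.

    prefix_counts[i] is the cumulative class-frequency vector of the
    first i blocks (block boundaries are the candidate cut points),
    so prefix_counts[0] is all zeros.  The evaluation function is
    Average Class Entropy: the size-weighted sum of interval entropies.
    Runs in O(k * B^2) time for B candidate cut points.
    """
    B = len(prefix_counts) - 1
    n = sum(prefix_counts[B])

    def interval_cost(i, j):
        # Weighted impurity of the interval covering blocks i+1 .. j.
        counts = [b - a for a, b in zip(prefix_counts[i], prefix_counts[j])]
        return sum(counts) / n * class_entropy(counts)

    INF = float("inf")
    # dp[m][j] = best cost of covering the first j blocks with m intervals.
    dp = [[INF] * (B + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, B + 1):
            dp[m][j] = min(dp[m - 1][i] + interval_cost(i, j)
                           for i in range(m - 1, j))
    return dp[k][B]

# Example: three blocks with class counts (3,0), (1,2), (0,4);
# the prefix vectors include the leading all-zero vector.
prefix = [(0, 0), (3, 0), (4, 2), (4, 6)]
print(optimal_multisplit(prefix, 2))  # cost of the best 2-interval split
```

The inner minimization over all earlier cut points is what makes the scheme quadratic in the number of boundary points; monotonicity properties of the evaluation function, the paper's central theme, are what would let that minimum be located without inspecting every candidate.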

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Elomaa, T., Rousu, J. (2000). On the Complexity of Optimal Multisplitting. In: Raś, Z.W., Ohsuga, S. (eds) Foundations of Intelligent Systems. ISMIS 2000. Lecture Notes in Computer Science (LNAI), vol 1932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39963-1_58

  • DOI: https://doi.org/10.1007/3-540-39963-1_58

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41094-2

  • Online ISBN: 978-3-540-39963-6

  • eBook Packages: Computer Science (R0)
