When Classification becomes a Problem: Using Branch-and-Bound to Improve Classification Efficiency

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7988)

Abstract

In a typical machine learning classification task there are two phases: training and prediction. This paper focuses on improving the efficiency of the prediction phase. When the number of classes is small, a linear search among the classes is an efficient way to find the most likely class. When the number of classes is large, however, linear search becomes inefficient; applications such as geolocation or time-based classification can require millions of subclasses to fit the data. This paper describes a branch-and-bound method for finding the most likely class when the training examples can be partitioned into thousands of subclasses. To evaluate its performance, we generated a synthetic set of random trees comprising billions of classes and ran branch-and-bound classification on them. Our results show that branch-and-bound classification is effective when the number of classes is large: the search effort grows roughly logarithmically with the number of classes rather than linearly.
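
The abstract summarizes the idea without spelling out the search itself, so the following is a minimal sketch, for illustration only, of a best-first branch-and-bound classifier over a tree of classes. It is not the authors' implementation: the Node class, the fixed-variance Gaussian leaf score (negative squared distance to a class centroid), and the bounding-box upper bound are hypothetical stand-ins for whatever model and admissible bound the paper actually uses.

import heapq

class Node:
    """A node of a class tree. A leaf is one class, scored (up to a constant)
    like a fixed-variance Gaussian log-likelihood: -||x - centroid||^2.
    An internal node keeps a bounding box over all centroids below it,
    which yields an optimistic upper bound on any descendant's score."""

    def __init__(self, label=None, centroid=None, children=None):
        self.label = label
        self.children = children or []
        if self.children:                  # internal node: merge the children's boxes
            self.lo = [min(v) for v in zip(*(c.lo for c in self.children))]
            self.hi = [max(v) for v in zip(*(c.hi for c in self.children))]
        else:                              # leaf: the box degenerates to the centroid
            self.lo = self.hi = list(centroid)

    def is_leaf(self):
        return not self.children

    def score(self, x):
        """Exact leaf score: negative squared distance to the class centroid."""
        return -sum((xi - ci) ** 2 for xi, ci in zip(x, self.lo))

    def upper_bound(self, x):
        """Optimistic score: negative squared distance from x to the bounding box
        (zero inside the box); never below any descendant leaf's exact score."""
        return -sum(max(lo - xi, 0.0, xi - hi) ** 2
                    for xi, lo, hi in zip(x, self.lo, self.hi))

def classify(root, x):
    """Best-first branch-and-bound: always expand the subtree with the highest
    upper bound; prune any subtree whose bound cannot beat the best leaf so far."""
    best_label, best_score = None, float("-inf")
    counter = 0                                        # tie-breaker; nodes are never compared
    frontier = [(-root.upper_bound(x), counter, root)] # max-heap via negated bounds
    while frontier:
        neg_bound, _, node = heapq.heappop(frontier)
        if -neg_bound <= best_score:
            break                                      # nothing left can improve the answer
        if node.is_leaf():
            score = node.score(x)
            if score > best_score:
                best_label, best_score = node.label, score
        else:
            for child in node.children:
                bound = child.upper_bound(x)
                if bound > best_score:                 # prune dominated subtrees
                    counter += 1
                    heapq.heappush(frontier, (-bound, counter, child))
    return best_label

# Tiny usage example: four classes grouped into two subtrees.
tree = Node(children=[
    Node(children=[Node("a", (0.0, 0.0)), Node("b", (1.0, 0.0))]),
    Node(children=[Node("c", (5.0, 5.0)), Node("d", (6.0, 5.0))]),
])
print(classify(tree, (5.2, 4.9)))                      # -> "c"

The essential property is that an internal node's upper bound never underestimates the score of any class beneath it, so discarding a subtree whose bound falls below the best class found so far can never discard the true optimum; when similar classes cluster together in the tree, most subtrees are pruned and only a small, roughly logarithmic-depth slice of the tree is ever expanded.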




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Prieditis, A., Lee, M. (2013). When Classification becomes a Problem: Using Branch-and-Bound to Improve Classification Efficiency. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2013. Lecture Notes in Computer Science (LNAI), vol 7988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39712-7_36

  • DOI: https://doi.org/10.1007/978-3-642-39712-7_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39711-0

  • Online ISBN: 978-3-642-39712-7
