When Classification becomes a Problem: Using Branch-and-Bound to Improve Classification Efficiency

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7988)

Abstract

In a typical machine learning classification task there are two phases: training and prediction. This paper focuses on improving the efficiency of the prediction phase. When the number of classes is small, a linear search among the classes is an efficient way to find the most likely class. When the number of classes is large, however, linear search becomes inefficient; applications such as geolocation or time-based classification can require millions of subclasses to fit the data. This paper describes a branch-and-bound method for finding the most likely class when the training examples can be partitioned into thousands of subclasses. To evaluate its performance, we generated a synthetic set of random trees comprising billions of classes and ran branch-and-bound classification on them. Our results show that branch-and-bound classification is effective when the number of classes is large: the search effort grows roughly logarithmically with the number of classes rather than linearly.
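
The abstract summarizes the idea without spelling out the search itself, so the following is a minimal sketch, for illustration only, of a best-first branch-and-bound classifier over a tree of classes. It is not the authors' implementation: the Node class, the fixed-variance Gaussian leaf score (negative squared distance to a class centroid), and the bounding-box upper bound are hypothetical stand-ins for whatever model and admissible bound the paper actually uses.

import heapq

class Node:
    """A node of a class tree. A leaf is one class, scored (up to a constant)
    like a fixed-variance Gaussian log-likelihood: -||x - centroid||^2.
    An internal node keeps a bounding box over all centroids below it,
    which yields an optimistic upper bound on any descendant's score."""

    def __init__(self, label=None, centroid=None, children=None):
        self.label = label
        self.children = children or []
        if self.children:                  # internal node: merge the children's boxes
            self.lo = [min(v) for v in zip(*(c.lo for c in self.children))]
            self.hi = [max(v) for v in zip(*(c.hi for c in self.children))]
        else:                              # leaf: the box degenerates to the centroid
            self.lo = self.hi = list(centroid)

    def is_leaf(self):
        return not self.children

    def score(self, x):
        """Exact leaf score: negative squared distance to the class centroid."""
        return -sum((xi - ci) ** 2 for xi, ci in zip(x, self.lo))

    def upper_bound(self, x):
        """Optimistic score: negative squared distance from x to the bounding box
        (zero inside the box); never below any descendant leaf's exact score."""
        return -sum(max(lo - xi, 0.0, xi - hi) ** 2
                    for xi, lo, hi in zip(x, self.lo, self.hi))

def classify(root, x):
    """Best-first branch-and-bound: always expand the subtree with the highest
    upper bound; prune any subtree whose bound cannot beat the best leaf so far."""
    best_label, best_score = None, float("-inf")
    counter = 0                                        # tie-breaker; nodes are never compared
    frontier = [(-root.upper_bound(x), counter, root)] # max-heap via negated bounds
    while frontier:
        neg_bound, _, node = heapq.heappop(frontier)
        if -neg_bound <= best_score:
            break                                      # nothing left can improve the answer
        if node.is_leaf():
            score = node.score(x)
            if score > best_score:
                best_label, best_score = node.label, score
        else:
            for child in node.children:
                bound = child.upper_bound(x)
                if bound > best_score:                 # prune dominated subtrees
                    counter += 1
                    heapq.heappush(frontier, (-bound, counter, child))
    return best_label

# Tiny usage example: four classes grouped into two subtrees.
tree = Node(children=[
    Node(children=[Node("a", (0.0, 0.0)), Node("b", (1.0, 0.0))]),
    Node(children=[Node("c", (5.0, 5.0)), Node("d", (6.0, 5.0))]),
])
print(classify(tree, (5.2, 4.9)))                      # -> "c"

The essential property is that an internal node's upper bound never underestimates the score of any class beneath it, so discarding a subtree whose bound falls below the best class found so far can never discard the true optimum; when similar classes cluster together in the tree, most subtrees are pruned and only a small, roughly logarithmic-depth slice of the tree is ever expanded.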




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Prieditis, A., Lee, M. (2013). When Classification becomes a Problem: Using Branch-and-Bound to Improve Classification Efficiency. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2013. Lecture Notes in Computer Science (LNAI), vol 7988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39712-7_36

  • DOI: https://doi.org/10.1007/978-3-642-39712-7_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39711-0

  • Online ISBN: 978-3-642-39712-7
