Abstract
Averaged n-Dependence Estimators (AnDE) is a family of learning algorithms that ranges from low variance coupled with high bias through to high variance coupled with low bias. The asymptotic error of the lowest-bias variant is Bayes optimal. The AnDE algorithms have training time that is linear in the number of training examples, learn in a single pass through the data, support incremental learning, handle missing values directly, and are robust in the face of noise. These characteristics make them particularly well suited to learning from large data. However, for higher orders of n they are very computationally demanding. This paper presents data structures and algorithms developed to reduce both memory and time for training and classification. These enhancements have enabled the evaluation and comparison of A3DE's effectiveness. The results provide further support for the hypothesis that as the number of training examples increases, decreasing error will be attained by members of the AnDE family with increasing levels of n.
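To illustrate the single-pass, incremental learning the abstract describes, the following is a minimal sketch of the n=1 member of the family (A1DE, better known as AODE) for discrete attributes. The class name, smoothing scheme, and data layout are illustrative assumptions, not the paper's implementation: training only accumulates counts, so it is linear in the number of examples and `fit` can be called repeatedly on new batches.

```python
from collections import Counter

class AODE:
    """Sketch of an A1DE (AODE) classifier over discrete attributes.

    Training is one counting pass (linear time, incremental).
    Classification averages a one-dependence estimator per parent attribute:
    P(y, x) ~ (1/a) * sum_i P(y, x_i) * prod_{j != i} P(x_j | y, x_i).
    """

    def __init__(self, n_attrs, smoothing=1.0):
        self.n_attrs = n_attrs
        self.m = smoothing                       # Laplace-style smoothing (an assumption)
        self.class_counts = Counter()            # N(y)
        self.pair_counts = Counter()             # N(y, i, x_i)
        self.triple_counts = Counter()           # N(y, i, x_i, j, x_j)
        self.values = [set() for _ in range(n_attrs)]
        self.n_seen = 0

    def fit(self, X, y):
        # Single pass: update joint-frequency tables; no search, no iteration.
        for xs, c in zip(X, y):
            self.n_seen += 1
            self.class_counts[c] += 1
            for i, xi in enumerate(xs):
                self.values[i].add(xi)
                self.pair_counts[(c, i, xi)] += 1
                for j, xj in enumerate(xs):
                    if j != i:
                        self.triple_counts[(c, i, xi, j, xj)] += 1
        return self

    def _joint(self, c, xs):
        # Average the one-dependence estimates over every attribute as parent.
        total = 0.0
        for i, xi in enumerate(xs):
            p = (self.pair_counts[(c, i, xi)] + self.m) / (
                self.n_seen + self.m * len(self.class_counts) * len(self.values[i]))
            for j, xj in enumerate(xs):
                if j != i:
                    p *= (self.triple_counts[(c, i, xi, j, xj)] + self.m) / (
                        self.pair_counts[(c, i, xi)] + self.m * len(self.values[j]))
            total += p
        return total / self.n_attrs

    def predict(self, xs):
        return max(self.class_counts, key=lambda c: self._joint(c, xs))
```

Higher-order members (A2DE, A3DE) replace the single parent attribute with pairs or triples of parents, which is exactly why their count tables grow so quickly and why the memory- and time-reduction techniques of this paper matter.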
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Salem, H., Suraweera, P., Webb, G.I., Boughton, J.R. (2012). Techniques for Efficient Learning without Search. In: Tan, P.N., Chawla, S., Ho, C.K., Bailey, J. (eds.) Advances in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science, vol. 7301. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30217-6_5
Print ISBN: 978-3-642-30216-9
Online ISBN: 978-3-642-30217-6