Preconditioning an Artificial Neural Network Using Naive Bayes

  • Conference paper
  • First Online:
Advances in Knowledge Discovery and Data Mining (PAKDD 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9651)

Abstract

Logistic Regression (LR) is a workhorse of the statistics community and a state-of-the-art machine learning classifier. It learns a linear model from inputs to outputs, trained by optimizing the Conditional Log-Likelihood (CLL) of the data. Recently, it has been shown that preconditioning LR with a Naive Bayes (NB) model speeds up LR learning many-fold. One can, however, train a linear model by optimizing the mean-square-error (MSE) instead of the CLL, which yields an Artificial Neural Network (ANN) with no hidden layer. In this work, we study the effect of NB preconditioning on such an ANN classifier. Optimizing MSE instead of CLL may lead to a lower-bias classifier and hence to better performance on big datasets. We show that NB preconditioning can speed up convergence significantly, and that optimizing a linear model with MSE leads to a lower-bias classifier than optimizing with CLL. We also compare performance with the state-of-the-art Random Forest classifier.
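To make the setting concrete, the following is a minimal sketch, not the paper's exact algorithm, of NB preconditioning for a hidden-layer-free ANN: the Naive Bayes parameters are computed in closed form, expressed as the weights of a linear model, and used as the starting point for gradient descent on the MSE. The toy data, Laplace smoothing constant, and learning rate below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n samples with d binary attributes (illustrative assumption).
n, d = 1000, 10
X = rng.integers(0, 2, size=(n, d)).astype(float)
w_true = rng.normal(size=d)
y = (X @ w_true > np.median(X @ w_true)).astype(float)

def nb_log_odds(X, y, alpha=1.0):
    """Naive Bayes parameters expressed as the weights of a linear model
    (binary attributes, Laplace smoothing with alpha)."""
    p1 = (X[y == 1].sum(axis=0) + alpha) / (y.sum() + 2 * alpha)        # P(x_i=1 | y=1)
    p0 = (X[y == 0].sum(axis=0) + alpha) / ((1 - y).sum() + 2 * alpha)  # P(x_i=1 | y=0)
    w = np.log(p1 / p0) - np.log((1 - p1) / (1 - p0))
    b = np.log(y.mean() / (1 - y.mean())) + np.log((1 - p1) / (1 - p0)).sum()
    return w, b

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Precondition: start gradient descent from the NB solution instead of zeros.
w, b = nb_log_odds(X, y)

# Fine-tune by gradient descent on the MSE: an ANN with no hidden layer.
lr = 0.5
for _ in range(200):
    p = sigmoid(X @ w + b)
    grad_z = (p - y) * p * (1 - p)   # dMSE/dz, up to the constant factor 2/n
    w -= lr * (X.T @ grad_z) / n
    b -= lr * grad_z.mean()

print("training accuracy:", ((sigmoid(X @ w + b) > 0.5) == y).mean())
```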


Notes

  1. Note that we add CLL as a subscript to WANBIA-C to show explicitly the objective function that it optimizes.

  2. We use the original L-BFGS implementation of [12], available from http://users.eecs.northwestern.edu/~nocedal/lbfgsb.html.
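For illustration only, the sketch below drives the same MSE objective through SciPy's scipy.optimize.minimize, which wraps the L-BFGS-B Fortran routines of [11, 12]; the paper links the original Fortran code directly, and the toy data and zero initialization here are assumptions (in the paper's setting the starting point would be the NB-preconditioned weights).

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 8)).astype(float)  # toy binary data (assumption)
y = (X.sum(axis=1) > 4).astype(float)

def mse_and_grad(theta, X, y):
    """MSE of a sigmoid linear model and its gradient, packed for L-BFGS-B."""
    w, b = theta[:-1], theta[-1]
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    err = p - y
    grad_z = 2.0 * err * p * (1.0 - p) / len(y)      # dMSE/dz
    return np.mean(err ** 2), np.append(X.T @ grad_z, grad_z.sum())

theta0 = np.zeros(X.shape[1] + 1)  # zeros for brevity; NB weights in the paper
res = minimize(mse_and_grad, theta0, args=(X, y), jac=True, method="L-BFGS-B")
print("final MSE:", res.fun)
```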

References

  1. Duda, R., Hart, P., Stork, D.: Pattern Classification. John Wiley and Sons, New York (2006)

  2. Minka, T.P.: A comparison of numerical optimizers for logistic regression (2003)

  3. Zaidi, N.A., Cerquides, J., Carman, M.J., Webb, G.I.: Alleviating naive Bayes attribute independence assumption by attribute weighting. J. Mach. Learn. Res. 14, 1947–1988 (2013)

  4. Zaidi, N.A., Carman, M.J., Cerquides, J., Webb, G.I.: Naive-Bayes inspired effective pre-conditioners for speeding-up logistic regression. In: IEEE International Conference on Data Mining (2014)

  5. Martinez, A., Chen, S., Webb, G.I., Zaidi, N.A.: Scalable learning of Bayesian network classifiers. J. Mach. Learn. Res. (2015) (in press)

  6. Zaidi, N.A., Webb, G.I., Carman, M.J., Petitjean, F.: Deep broad learning - big models for big data (2015). arXiv:1509.01346

  7. Kohavi, R., Wolpert, D.: Bias plus variance decomposition for zero-one loss functions. In: ICML, pp. 275–283 (1996)

  8. Webb, G.I.: Multiboosting: A technique for combining boosting and wagging. Mach. Learn. 40(2), 159–196 (2000)

  9. Brain, D., Webb, G.I.: The need for low bias algorithms in classification learning from small data sets. In: PKDD, pp. 62–73 (2002)

  10. Fayyad, U.M., Irani, K.B.: On the handling of continuous-valued attributes in decision tree generation. Mach. Learn. 8(1), 87–102 (1992)

  11. Zhu, C., Byrd, R.H., Nocedal, J.: L-BFGS-B: Fortran routines for large scale bound constrained optimization. ACM Trans. Math. Softw. 23(4), 550–560 (1997)

  12. Byrd, R., Lu, P., Nocedal, J.: A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Stat. Comput. 16(5), 1190–1208 (1995)

  13. Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001)

  14. Brain, D., Webb, G.: On the effect of data set size on bias and variance in classification learning. In: Proceedings of the Fourth Australian Knowledge Acquisition Workshop, pp. 117–128. University of New South Wales (1999)

Acknowledgements

This research has been supported by the Australian Research Council under grants DP120100553 and DP140100087, and Asian Office of Aerospace Research and Development, Air Force Office of Scientific Research under contracts FA2386-15-1-4007 and FA2386-15-1-4017.

Author information

Corresponding author

Correspondence to Nayyar A. Zaidi.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Zaidi, N.A., Petitjean, F., Webb, G.I. (2016). Preconditioning an Artificial Neural Network Using Naive Bayes. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_28

  • DOI: https://doi.org/10.1007/978-3-319-31753-3_28

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31752-6

  • Online ISBN: 978-3-319-31753-3

  • eBook Packages: Computer Science (R0)
