Abstract
Numerous enhancements have been proposed to mitigate the attribute conditional independence assumption of naive Bayes (NB). However, almost all of them focus only on the original attribute space. Given the complexity of real-world applications, we argue that the discriminative information provided by the original attribute space may be insufficient for classification. In this study, we therefore aim to discover latent attributes beyond the original attribute space and propose a novel two-stage model called attribute augmented and weighted naive Bayes (A2WNB). In the first stage, we build multiple random one-dependence estimators (RODEs), use each built RODE to classify each training instance in turn, and define the predicted class labels as that instance's latent attributes. We then construct the augmented attributes by concatenating the latent attributes with the original attributes. In the second stage, to alleviate attribute redundancy, we optimize the augmented attributes' weights by maximizing the conditional log-likelihood (CLL) of the built model. Extensive experimental results show that A2WNB significantly outperforms NB and other existing state-of-the-art competitors.
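The stage-1 augmentation described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the lambda functions are hypothetical stand-ins for trained RODEs (which are one-dependence Bayesian estimators in the paper), and the stage-2 CLL-based weight optimization is omitted.

```python
def augment_attributes(X, estimators):
    """Stage 1 of A2WNB (sketch): each base estimator's predicted class
    label for an instance is appended as a new latent attribute, so every
    instance ends up with its original attributes plus one latent
    attribute per estimator."""
    return [list(x) + [est(x) for est in estimators] for x in X]

# Toy data: three instances with two binary attributes each.
X = [[0, 1], [1, 1], [1, 0]]

# Hypothetical stand-ins for built RODEs: trivial rules that predict a
# class label from a single attribute.
est_a = lambda x: x[0]
est_b = lambda x: 1 - x[1]

aug = augment_attributes(X, [est_a, est_b])
# Each instance now carries 2 original + 2 latent attributes; a weighted
# NB would subsequently be trained on this augmented representation.
```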
References
Wu X, Kumar V, Quinlan J R, et al. Top 10 algorithms in data mining. Knowl Inf Syst, 2008, 14: 1–37
Friedman N, Geiger D, Goldszmidt M. Bayesian network classifiers. Mach Learn, 1997, 29: 131–163
Webb G I, Boughton J R, Wang Z H. Not so naive Bayes: aggregating one-dependence estimators. Mach Learn, 2005, 58: 5–24
Jiang L X, Zhang H, Cai Z H. A novel Bayes model: hidden naive Bayes. IEEE Trans Knowl Data Eng, 2009, 21: 1361–1371
Qiu C, Jiang L X, Li C Q. Not always simple classification: learning superparent for class probability estimation. Expert Syst Appl, 2015, 42: 5433–5440
Kohavi R. Scaling up the accuracy of naive-Bayes classifiers: a decision-tree hybrid. In: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, 1996. 202–207
Frank E, Hall M A, Pfahringer B. Locally weighted naive Bayes. In: Proceedings of the 19th Conference in Uncertainty in Artificial Intelligence, 2003. 249–256
Wang S S, Jiang L X, Li C Q. Adapting naive Bayes tree for text classification. Knowl Inf Syst, 2015, 44: 77–89
Jiang L X, Wang D H, Cai Z H. Discriminatively weighted naive Bayes and its application in text classification. Int J Artif Intell Tools, 2012, 21: 1250007
Jiang L X, Qiu C, Li C Q. A novel minority cloning technique for cost-sensitive learning. Int J Patt Recogn Artif Intell, 2015, 29: 1551004
Xu W Q, Jiang L X, Yu L J. An attribute value frequency-based instance weighting filter for naive Bayes. J Exp Theor Artif Intell, 2019, 31: 225–236
Langley P, Sage S. Induction of selective Bayesian classifiers. In: Proceedings of the 10th Annual Conference on Uncertainty in Artificial Intelligence, 1994. 399–406
Chen S, Martinez A M, Webb G I. Highly scalable attribute selection for averaged one-dependence estimators. In: Proceedings of the 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2014. 86–97
Chen S, Webb G I, Liu L, et al. A novel selective naïve Bayes algorithm. Knowl-Based Syst, 2020, 192: 105361
Hall M. A decision tree-based attribute weighting filter for naive Bayes. Knowl-Based Syst, 2007, 20: 120–126
Zaidi N A, Cerquides J, Carman M J, et al. Alleviating naive Bayes attribute independence assumption by attribute weighting. J Mach Learn Res, 2013, 14: 1947–1988
Jiang L X, Zhang L G, Li C Q, et al. A correlation-based feature weighting filter for naive Bayes. IEEE Trans Knowl Data Eng, 2019, 31: 201–213
Hindi K E. Fine tuning the naïve Bayesian learning algorithm. AI Commun, 2014, 27: 133–141
Diab D M, Hindi K E. Using differential evolution for fine tuning naïve Bayesian classifiers and its application for text classification. Appl Soft Comput, 2017, 54: 183–199
Hindi K E, Aljulaidan R R, AlSalman H. Lazy fine-tuning algorithms for naïve Bayesian text classification. Appl Soft Comput, 2020, 96: 106652
Chen S L, Martinez A M, Webb G I, et al. Sample-based attribute selective AnDE for large data. IEEE Trans Knowl Data Eng, 2017, 29: 172–185
Zhang H, Jiang L X, Yu L J. Attribute and instance weighted naive Bayes. Pattern Recogn, 2021, 111: 107674
Duan Z Y, Wang L M, Chen S L, et al. Instance-based weighting filter for superparent one-dependence estimators. Knowl-Based Syst, 2020, 203: 106085
Zhang H, Petitjean F, Buntine W. Bayesian network classifiers using ensembles and smoothing. Knowl Inf Syst, 2020, 62: 3457–3480
Liu Y, Wang L M, Mammadov M. Learning semi-lazy Bayesian network classifier under the c.i.i.d assumption. Knowl-Based Syst, 2020, 208: 106422
Long Y G, Wang L M, Duan Z Y, et al. Robust structure learning of Bayesian network by identifying significant dependencies. IEEE Access, 2019, 7: 116661
Jiang L X. Random one-dependence estimators. Pattern Recogn Lett, 2011, 32: 532–539
Wu J, Pan S R, Zhu X Q, et al. Self-adaptive attribute weighting for naive Bayes classification. Expert Syst Appl, 2015, 42: 1487–1502
Jiang L X, Li C Q, Wang S S, et al. Deep feature weighting for naive Bayes and its application to text classification. Eng Appl Artif Intell, 2016, 52: 26–39
Lee C H. A gradient approach for value weighted classification learning in naive Bayes. Knowl-Based Syst, 2015, 85: 71–79
Lee C H. An information-theoretic filter approach for value weighted classification learning in naive Bayes. Data Knowl Eng, 2018, 113: 116–128
Zhang H, Sheng S L. Learning weighted naive Bayes with accurate ranking. In: Proceedings of the 4th International Conference on Data Mining, 2004. 567–570
Jiang L X, Zhang L G, Yu L J, et al. Class-specific attribute weighted naive Bayes. Pattern Recogn, 2019, 88: 321–330
Zhang H, Jiang L X, Yu L J. Class-specific attribute value weighting for naive Bayes. Inf Sci, 2020, 508: 260–274
Mahmoudi A, Yaakub M R, Bakar A A. The relationship between online social network ties and user attributes. ACM Trans Knowl Discov Data, 2019, 13: 26
Ali S, Shakeel M H, Khan I, et al. Predicting attributes of nodes using network structure. ACM Trans Intell Syst Technol, 2021, 12: 21
Jiang L X, Cai Z H, Zhang H, et al. Not so greedy: randomly selected naive Bayes. Expert Syst Appl, 2012, 39: 11022–11028
Wu J, Cai Z H. Attribute weighting via differential evolution algorithm for attribute weighted naive Bayes (WNB). J Comput Inform Syst, 2011, 7: 1672–1679
Zhu C Y, Byrd R H, Lu P, et al. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans Math Softw, 1997, 23: 550–560
Breiman L. Random forests. Mach Learn, 2001, 45: 5–32
Witten I H, Frank E, Hall M A. Data Mining: Practical Machine Learning Tools and Techniques. 3rd ed. Amsterdam: Elsevier, 2011
Nadeau C, Bengio Y. Inference for the generalization error. Mach Learn, 2003, 52: 239–281
Demšar J. Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res, 2006, 7: 1–30
Olave M, Rajkovic V, Bohanec M. An application for admission in public school systems. Expert Syst Public Admin, 1989, 1: 145–160
Acknowledgements
The work was supported by National Natural Science Foundation of China (Grant No. U1711267), Fundamental Research Funds for the Central Universities (Grant No. CUGGC03), and Foundation of Key Laboratory of Artificial Intelligence, Ministry of Education, China (Grant No. AI2020002).
Zhang, H., Jiang, L. & Li, C. Attribute augmented and weighted naive Bayes. Sci. China Inf. Sci. 65, 222101 (2022). https://doi.org/10.1007/s11432-020-3277-0