Abstract
The performance of machine learning algorithms relies heavily on the abundance and quality of the available training data. However, the data may originate from diverse subspaces, so a labeled training set typically offers only limited insight and inadequately represents the full range of potential scenarios. How to safely exploit unlabeled instances is therefore an emerging and important problem for learning Bayesian network classifiers (BNCs), which graphically model the probabilistic relationships among variables in the form of a directed acyclic graph (DAG). In this paper, we introduce dual learning into the learning procedure to safely exploit unlabeled instances. We elucidate the mapping between information metrics and local DAGs, as well as the distinction between informational (in)dependence and probabilistic (in)dependence. Building on this foundation, we propose new metrics that accurately measure attribute dependencies within unlabeled instances. The proposed dual learning-based flexible selective k-dependence Bayesian network classifier (DL-FSKDB) employs eager learning to construct an initial model and lazy learning for personalized fine-tuning and optimization. Extensive experimental evaluation on 36 datasets spanning domains with distinct properties shows that the learned BNCs deliver competitive classification performance against state-of-the-art learners in terms of zero–one loss, bias, variance, and F1-measure.
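To make the abstract's terminology concrete, the sketch below illustrates, assuming discrete attributes, the two classical information metrics on which KDB-style structure learning is built: mutual information I(Xi; C) for ranking attributes by relevance to the class, and conditional mutual information I(Xi; Xj | C) for selecting up to k attribute parents. This is a minimal illustration of the standard metrics only, not the paper's proposed unlabeled-data metrics or the DL-FSKDB implementation; all function names and the toy data are hypothetical.

```python
import numpy as np
from collections import Counter

def mutual_information(x, c):
    """I(X; C) between two discrete variables, estimated from empirical frequencies."""
    x, c = np.asarray(x), np.asarray(c)
    n = len(x)
    mi = 0.0
    for (xv, cv), n_xc in Counter(zip(x.tolist(), c.tolist())).items():
        p_xc = n_xc / n            # joint probability P(x, c)
        p_x = np.mean(x == xv)     # marginal P(x)
        p_c = np.mean(c == cv)     # marginal P(c)
        mi += p_xc * np.log2(p_xc / (p_x * p_c))
    return mi

def conditional_mutual_information(xi, xj, c):
    """I(Xi; Xj | C) = sum over class values cv of P(cv) * I(Xi; Xj | C = cv)."""
    xi, xj, c = np.asarray(xi), np.asarray(xj), np.asarray(c)
    cmi = 0.0
    for cv, n_c in Counter(c.tolist()).items():
        mask = (c == cv)
        cmi += (n_c / len(c)) * mutual_information(xi[mask], xj[mask])
    return cmi

# Toy usage: rank attributes by I(Xi; C); a KDB-style learner would then give
# each attribute the class plus up to k already-ranked attributes as parents,
# chosen by the highest I(Xi; Xj | C).
X = np.array([[0, 1], [1, 1], [0, 0], [1, 0], [1, 1], [0, 0]])
y = np.array([0, 1, 0, 1, 1, 0])
ranking = sorted(range(X.shape[1]), key=lambda i: -mutual_information(X[:, i], y))
print(ranking, conditional_mutual_information(X[:, 0], X[:, 1], y))
```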
Data availability
All datasets used in the paper are publicly available at http://archive.ics.uci.edu/ml
Acknowledgements
This work is supported by the Open Research Project of the Hubei Key Laboratory of Intelligent Geo-Information Processing, China (No. KLIGIP-2021A04), the Scientific and Technological Development Scheme of Jilin Province, China (No. 20240101371JC), and the Key Research Project of Sports Science in Jilin Province (No. 202423).
Author information
Contributions
Y.Z. handled methodology, visualization, and wrote the original draft. L.W. was responsible for conceptualization, supervision, writing (review and editing), and funding acquisition. X.Z. performed formal analysis, investigation, and resources. T.J. contributed to investigation, resources, and validation. M.S. and X.L. were involved in investigation, validation, and review.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhao, Y., Wang, L., Zhu, X. et al. Probability knowledge acquisition from unlabeled instance based on dual learning. Knowl Inf Syst 67, 521–547 (2025). https://doi.org/10.1007/s10115-024-02238-9