Hybrid imputation-based optimal evidential classification for missing data

Zhang, Zhen; Tian, Hong-peng

doi:10.1007/s10489-024-05950-9

Published: 02 December 2024

Applied Intelligence, volume 55, article number 69 (2025)

Abstract

Classifying incomplete data remains challenging because missing values yield uncertain and imprecise information that degrades classification performance. To address this issue, we propose a hybrid imputation-based optimal evidential classification (HOEC) method for missing data under the Dempster-Shafer theory framework. HOEC captures uncertainty and imprecision during both imputation and classification. Specifically, a hybrid imputation strategy combines single and multiple imputation to estimate the missing values in the training and test sets, which yields accurate estimates while capturing their uncertainty. An optimal evidential partition rule then adaptively assigns an incomplete sample to a singleton class or a meta-class under the Dempster-Shafer theory framework, which captures the imprecision caused by missing values and reduces classification errors. Experiments on several incomplete datasets demonstrate the effectiveness of HOEC compared with related methods.
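As background on the Dempster-Shafer framework mentioned in the abstract, the following is a minimal sketch of Dempster's rule of combination, in which mass can be placed on a meta-class (a set of several classes) as well as on singleton classes. It is a generic illustration and not the optimal evidential partition rule of HOEC, which is not described in this preview. The function name `dempster_combine`, the class labels, and the mass values are assumptions chosen for the example.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels (a focal element)
    to its mass; the masses of each input are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the intersection of the two sets.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Normalize by the non-conflicting mass.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Hypothetical example: two sources of evidence over classes w1 and w2.
w1, w2 = frozenset({"w1"}), frozenset({"w2"})
meta = w1 | w2  # the meta-class {w1, w2} expresses "w1 or w2, undecided"
m_a = {w1: 0.6, meta: 0.4}
m_b = {w2: 0.5, meta: 0.5}
fused = dempster_combine(m_a, m_b)
```

In this example the two sources partly disagree, so some mass remains on the meta-class after fusion instead of being forced onto a single class.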

Data Availability and Access

The datasets analyzed in this study are available at the UCI repository (http://archive.ics.uci.edu/).

Notes

The incomplete sample $\textbf{y}_j$ is directly submitted to a specific singleton class if its KNNs in X come from one class, that is, $V=1$.
For ease of understanding, we assume that $\textbf{y}_j$ is difficult to assign to either of two singleton classes, $\omega_\varphi$ and $\omega_{\max}$; that is, $\{\omega_\varphi, \omega_{\max}\}$ is the most likely meta-class to which the sample $\textbf{y}_j$ may belong.
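The first note can be sketched as follows: find the K nearest neighbors of a sample in the training set, count the number of distinct classes V among them, and assign the sample directly when V = 1. This is a minimal sketch under stated assumptions: the sample is taken to be already imputed (no missing entries), Euclidean distance is used, and the helper names `knn_labels` and `direct_singleton` and the toy data are hypothetical. The case V > 1 is handled by the paper's evidential partition rule, which is not reproduced here.

```python
import math


def knn_labels(x_train, labels, y, k):
    """Return the labels of the k training samples closest to y (Euclidean)."""
    order = sorted(range(len(x_train)), key=lambda i: math.dist(x_train[i], y))
    return [labels[i] for i in order[:k]]


def direct_singleton(x_train, labels, y, k):
    """Return the class if all k neighbors share it (V == 1), else None."""
    neighbor_classes = set(knn_labels(x_train, labels, y, k))
    if len(neighbor_classes) == 1:
        return next(iter(neighbor_classes))
    # V > 1: the decision is left to the evidential partition rule (not sketched).
    return None


# Hypothetical toy data: three samples of class "a", two of class "b".
x_train = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["a", "a", "a", "b", "b"]

clear_case = direct_singleton(x_train, labels, (0.2, 0.1), k=3)
mixed_case = direct_singleton(x_train, labels, (3.0, 3.5), k=3)
```

Here `clear_case` is assigned directly because its three neighbors all belong to one class, while `mixed_case` has neighbors from both classes and is deferred.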

Acknowledgements

This study was partially supported by the National Key Research and Development Program of China (No. 2018YFCXXXXXXX), the Henan Major Public Welfare Project (No. 201300311200), and the Henan Key Research and Development Project (No. 231111211600).

Author information

Hong-peng Tian contributed equally to this work.

Authors and Affiliations

School of Electrical and Information Engineering, Zhengzhou University, No. 100 Science Avenue, Zhengzhou, Henan, 450001, China
Zhen Zhang & Hong-peng Tian

Contributions

Zhen Zhang: Methodology, Supervision, Writing and Editing. Hong-peng Tian: Software, Methodology, and Original draft preparation.

Corresponding author

Correspondence to Hong-peng Tian.

Ethics declarations

Competing Interests

The authors declare that they have no competing financial interests or personal relationships that may have influenced the work reported in this study.

Ethical and Informed Consent for Data Used

We declare that this study is an original work and has not been published or submitted elsewhere. We confirm that the order of the authors listed in the manuscript was approved by all authors and that informed consent was obtained from all authors involved in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Z., Tian, Hp. Hybrid imputation-based optimal evidential classification for missing data. Appl Intell 55, 69 (2025). https://doi.org/10.1007/s10489-024-05950-9

Accepted: 11 October 2024
Published: 02 December 2024
DOI: https://doi.org/10.1007/s10489-024-05950-9
