Imbalanced-type Incomplete Data Fuzzy Modeling and Missing Value Imputations

Authors:

Xiaochen Lai,

Yidan Lu,

Liyong Zhang,

Yi Feng,

Genglin Zhang

ICMLSC '21: Proceedings of the 2021 5th International Conference on Machine Learning and Soft Computing

Pages 33 - 37

https://doi.org/10.1145/3453800.3453807

Published: 18 June 2021

Abstract

Missing values are common in real-world datasets and are caused by many factors, such as errors in data acquisition or storage, equipment failure, and human error. Modeling incomplete data and imputing missing values have therefore become increasingly important tasks. Because the regression relationship between attributes usually differs from cluster to cluster, this paper proposes a cluster-dependent method for modeling incomplete data, called the DS-TS-ALI model. A precise regression model between attributes is established for incomplete data within the framework of the Takagi-Sugeno (TS) fuzzy model. For premise parameter identification, a distance density (DS) algorithm based on a partial distance strategy is proposed to handle the imbalanced distribution of data categories, and a membership reconstruction strategy is built on it. For consequence parameter identification, we propose an alternating iterative (ALI) scheme that treats missing values as variables while identifying the parameters of the attribute regression model. Imputation is completed at the end of the modeling process. Experiments on several datasets demonstrate the effectiveness of the proposed method.
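
To make two of the ingredients named in the abstract more concrete, the following is a minimal sketch, not the authors' DS-TS-ALI implementation. It assumes the partial distance is the commonly used form in which the squared distance over observed attributes is rescaled by the ratio of total to observed attributes, and it illustrates the alternating idea with a single global linear regression per attribute rather than cluster-wise TS rules. The function names `partial_distance_sq` and `alternating_impute`, the mean initialisation, and the fixed iteration count are all assumptions made for illustration.

```python
import numpy as np


def partial_distance_sq(x, v):
    """Squared partial distance between a sample x (NaN marks a missing
    attribute) and a complete reference vector v such as a cluster center."""
    observed = ~np.isnan(x)
    n_obs = int(observed.sum())
    if n_obs == 0:
        raise ValueError("sample has no observed attributes")
    diff = x[observed] - v[observed]
    # Rescale by (total attributes / observed attributes) so that samples
    # with different numbers of missing values remain comparable.
    return x.size / n_obs * float(np.dot(diff, diff))


def alternating_impute(X, n_iter=50):
    """Alternate between fitting a linear regression for each incomplete
    attribute and updating that attribute's missing entries.

    Simplification (assumption): one global linear model per attribute,
    instead of one consequent model per fuzzy rule/cluster.
    """
    X = np.array(X, dtype=float)          # np.array copies, input is untouched
    mask = np.isnan(X)
    # Initialise the missing-value "variables" with observed column means.
    col_means = np.nanmean(X, axis=0)
    X[mask] = np.take(col_means, np.where(mask)[1])
    n, d = X.shape
    for _ in range(n_iter):
        for j in range(d):
            if not mask[:, j].any():
                continue                   # attribute j is fully observed
            others = np.delete(np.arange(d), j)
            A = np.column_stack([X[:, others], np.ones(n)])
            # Step 1: missing values held fixed, fit regression parameters.
            coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
            # Step 2: parameters held fixed, update the missing entries.
            X[mask[:, j], j] = A[mask[:, j]] @ coef
    return X


if __name__ == "__main__":
    # Toy data following y = 2x + 1 with one missing y value.
    data = np.array([[0, 1], [1, 3], [2, 5], [3, np.nan], [4, 9]], dtype=float)
    print(alternating_impute(data))
```

In this sketch the imputed values are simply the final state of the missing-value variables when the loop ends, which mirrors the abstract's statement that imputation is completed at the end of the modeling process. A cluster-aware variant would weight each sample by its rule membership when fitting the regressions.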

Published In

ICMLSC '21: Proceedings of the 2021 5th International Conference on Machine Learning and Soft Computing

January 2021

178 pages

ISBN:9781450387613

DOI:10.1145/3453800

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Key R&D Program of China
Natural Science Foundation of China

Conference

ICMLSC '21

ICMLSC '21: 2021 The 5th International Conference on Machine Learning and Soft Computing

January 29 - 31, 2021

Da Nang, Viet Nam
