DOI: 10.1145/3453800.3453807
research-article

Imbalanced-type Incomplete Data Fuzzy Modeling and Missing Value Imputations

Published: 18 June 2021

ABSTRACT

Missing values are common in real-world datasets and are caused by many factors, such as errors in data acquisition or storage, equipment failure, or human error. Incomplete data modeling and missing value imputation have therefore become increasingly important tasks. Since the regression relationship between attributes usually differs from cluster to cluster, this paper proposes a cluster-dependent method for modeling incomplete data, called the DS-TS-ALI model. A precise regression model between attributes is established for incomplete data in the framework of the Takagi-Sugeno (TS) fuzzy model. In the premise parameter identification part, a distance density (DS) algorithm based on a partial distance strategy is proposed to handle an imbalanced distribution of data categories. Moreover, a membership reconstruction strategy is proposed on this basis. In the consequence parameter identification part, we propose an alternating iterative (ALI) scheme that treats missing values as variables in order to identify the parameters of the attribute regression model. Imputation is completed at the end of the modeling process. Experiments on several datasets are conducted to demonstrate the effectiveness of the proposed method.
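
The abstract does not spell out the DS algorithm, so the following is only a minimal sketch of the partial distance idea it builds on, as commonly defined for fuzzy c-means clustering of incomplete data: the squared distance is summed over the observed attributes only and rescaled by the ratio of total to observed attributes. The function name `partial_distance_sq` and the use of NaN to mark missing entries are assumptions made for illustration, not details taken from the paper.

```python
import math


def partial_distance_sq(x, v):
    """Squared partial distance between a sample x and a prototype v.

    Assumption: missing entries of x are encoded as NaN and v is complete.
    Only observed attributes contribute to the sum, which is then rescaled
    by (number of attributes) / (number of observed attributes).
    """
    if len(x) != len(v):
        raise ValueError("x and v must have the same length")
    observed = [(xj, vj) for xj, vj in zip(x, v) if not math.isnan(xj)]
    if not observed:
        raise ValueError("x has no observed attributes")
    total = sum((xj - vj) ** 2 for xj, vj in observed)
    return len(x) / len(observed) * total


# Hypothetical sample with its second attribute missing.
sample = [1.0, math.nan, 3.0]
center = [0.0, 5.0, 1.0]
print(partial_distance_sq(sample, center))  # 7.5
```

For a complete sample the scale factor is 1, so the value reduces to the ordinary squared Euclidean distance.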


  • Published in

    ICMLSC '21: Proceedings of the 2021 5th International Conference on Machine Learning and Soft Computing
    January 2021
    178 pages
    ISBN:9781450387613
    DOI:10.1145/3453800

    Copyright © 2021 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited