research-article

Missing data filling method based on Aitchison simplex space

Author:
Bing Liu

Dazhou Vocational and Technical College, Dazhou, Sichuan, China

EITCE '21: Proceedings of the 2021 5th International Conference on Electronic Information Technology and Computer Engineering, October 2021, Pages 1341–1344. https://doi.org/10.1145/3501409.3501646

Published: 31 December 2021

ABSTRACT

Data in the Aitchison simplex space are subject to boundedness and constant-sum constraints. As a result, they generally do not follow a multivariate normal distribution, and some variables are exactly or approximately linearly related, which makes such data very difficult to model. In multiple regression analysis, a small change in a sample attribute value can strongly perturb the estimated regression coefficients and make them extremely unstable, so general-purpose statistical methods cannot be used to interpret and process the data properly. To address this problem, and building on the definitions of the complete algebraic operations in the Aitchison simplex space, this paper proposes a method for filling (imputing) missing data in the simplex space: first, the k-means method is used for an initial fill in the simplex space; next, the isometric log-ratio (ilr) transformation is applied; finally, the principal component method is used to correct the initial fill values. Results on an example show that the principal component correction filling method, built on the proposed complete system of algebraic operations in the simplex space, performs better than the other filling methods compared.
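
The abstract relies on two ingredients that a short sketch can make concrete: the algebraic operations on the simplex (perturbation and powering) and the isometric log-ratio transformation that maps compositions to unconstrained real coordinates. The code below is a minimal illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the ilr basis is one common pivot-type choice (the abstract does not say which basis is used), and the k-means initial fill and principal component correction steps are not reproduced.

```python
import math

def closure(x, total=1.0):
    """Rescale strictly positive parts so that they sum to `total`."""
    s = sum(x)
    return [total * v / s for v in x]

def perturb(x, y):
    """Perturbation: the simplex analogue of vector addition."""
    return closure([a * b for a, b in zip(x, y)])

def power(x, alpha):
    """Powering: the simplex analogue of scalar multiplication."""
    return closure([a ** alpha for a in x])

def ilr(x):
    """ilr coordinates of a D-part composition (assumed pivot-type basis)."""
    logs = [math.log(v) for v in x]
    coords = []
    for i in range(1, len(x)):
        # Contrast the mean log of the first i parts with part i + 1.
        mean_first = sum(logs[:i]) / i
        coords.append(math.sqrt(i / (i + 1)) * (mean_first - logs[i]))
    return coords

def ilr_inv(z):
    """Map D - 1 ilr coordinates back to a closed D-part composition."""
    clr = [0.0] * (len(z) + 1)
    for i, zi in enumerate(z, start=1):
        scale = math.sqrt(i / (i + 1))
        for j in range(i):
            clr[j] += zi * scale / i
        clr[i] -= zi * scale
    return closure([math.exp(v) for v in clr])

def aitchison_distance(x, y):
    """Aitchison distance, computed from centred log-ratio vectors."""
    def clr(v):
        logs = [math.log(a) for a in v]
        mean = sum(logs) / len(logs)
        return [l - mean for l in logs]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

if __name__ == "__main__":
    x = closure([0.2, 0.3, 0.5])
    y = closure([1.0, 2.0, 7.0])
    z = ilr(x)
    print("ilr coordinates:", z)
    print("round trip:", ilr_inv(z))
    print("Aitchison distance:", aitchison_distance(x, y))
```

Because the ilr coordinates are unconstrained and the map preserves distances, standard multivariate tools such as principal component analysis can be applied to the transformed data, and results can be mapped back with `ilr_inv`. All parts must be strictly positive, so missing or zero parts need an initial fill before the transformation is applied.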

Index Terms

Missing data filling method based on Aitchison simplex space
1. Computing methodologies
2. Theory of computation
  1. Design and analysis of algorithms

Published in

EITCE '21: Proceedings of the 2021 5th International Conference on Electronic Information Technology and Computer Engineering
October 2021
1723 pages
ISBN: 9781450384322
DOI: 10.1145/3501409

Copyright © 2021 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
component data
missing data
principal component analysis
simplex space
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
EITCE '21 paper acceptance rate: 294 of 531 submissions, 55%
Overall acceptance rate: 508 of 972 submissions, 52%