Kent feature embedding for classification of compositional data with zeros

  • Original Paper
Statistics and Computing

Abstract

Compositional data pose challenges to current classification methods because of their non-negativity and unit-sum constraints, especially when some of the components are zero. In this paper, we develop an effective classification method for multivariate compositional data in which some components equal zero. Specifically, a Kent feature embedding technique is first proposed to transform the compositional data and improve data quality. We then use a support vector machine, a state-of-the-art machine learning model, to build the classifier. Numerical simulations show that the proposed method is effective. Results on multiple real datasets, covering species classification, day-night image classification and household consumption pattern recognition, further verify that the proposed method achieves good classification performance and outperforms its competitors. This method should help to broaden the practical use of compositional data with zeros in classification tasks.
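To make the two constraints and the zero problem concrete, the following minimal sketch uses only the Python standard library. It is not the Kent feature embedding of the paper (that procedure is given in Algorithm 2). It only illustrates, under our own assumptions, two well-known facts: a log-ratio transform such as the centered log-ratio is undefined when a component is zero, whereas the element-wise square root maps any composition, zeros included, onto the unit sphere, which is the kind of space on which the Kent distribution is defined. The function names `closure`, `sqrt_to_sphere` and `clr` are hypothetical helpers introduced for illustration.

```python
import math


def closure(parts):
    """Rescale non-negative parts so that they sum to one (a composition)."""
    total = sum(parts)
    if total <= 0 or any(p < 0 for p in parts):
        raise ValueError("parts must be non-negative with a positive sum")
    return [p / total for p in parts]


def sqrt_to_sphere(composition):
    """Element-wise square root: sends a unit-sum vector to a unit-norm vector."""
    return [math.sqrt(p) for p in composition]


def clr(composition):
    """Centered log-ratio transform; fails when any component is zero."""
    logs = [math.log(p) for p in composition]  # math.log(0.0) raises ValueError
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# A three-part composition with one zero component (illustrative values only).
x = closure([3.0, 0.0, 1.0])

# The square-root image has squared Euclidean norm equal to the sum of x, i.e. 1.
z = sqrt_to_sphere(x)
squared_norm = sum(v * v for v in z)

# The log-ratio transform cannot be evaluated on the same point.
try:
    clr(x)
    clr_defined = True
except ValueError:
    clr_defined = False
```

Here `squared_norm` equals one up to floating-point rounding and `clr_defined` is `False`, which is one way to see why zeros call for a treatment different from the usual log-ratio approach.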

Data availability statement

The data are publicly available, and details are given in the paper. The data can also be made available on reasonable request.

Notes

  1. The pseudocode for the proposed Kent feature embedding is given in Algorithm 2, which is placed in the Appendix to keep the paper concise.

Funding

This study is funded by the National Natural Science Foundation of China (Nos. 72371257, 72001222, 71873012). RG is partially supported by the Humanities and Social Science General Program of the Ministry of Education of China (No. 23YJC910002). SL acknowledges support from the Jing Ying Scholar Support Program at the Central University of Finance and Economics (CUFE) and is a member of the Financial Sustainable Development Research Team at CUFE. SL, WW and RG also acknowledge support from the Program for Innovation Research, the “Double First-Class” Disciplinary Project and the Disciplinary Funding at CUFE.

Author information

Contributions

SL: Conceptualization; Methodology; Formal analysis; Writing—original draft; Writing—review & editing. WW: Formal analysis; Writing—review & editing. RG: Conceptualization; Methodology; Writing—review & editing.

Corresponding author

Correspondence to Rong Guan.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Algorithm 2: Kent feature embedding

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lu, S., Wang, W. & Guan, R. Kent feature embedding for classification of compositional data with zeros. Stat Comput 34, 69 (2024). https://doi.org/10.1007/s11222-024-10382-z

  • DOI: https://doi.org/10.1007/s11222-024-10382-z
