Abstract
Inflammatory bowel diseases can be severe, but metagenomic data make it possible to diagnose them and take the necessary steps to prevent further complications. Identifying the composition in the human body that causes the disease hinges on carefully selecting features from the metagenomic data. Our research demonstrates that using the Random Forest machine learning technique to rank relative-abundance features for disease prediction tasks is reliable. We also find that selecting between 1 and 50 of the top-ranked features improves diagnostic accuracy. In addition, we intersect the Top 10, 20, 30, 40, and 50 feature sets to determine which features appear in all datasets. Our experiments on six inflammatory bowel disease-related datasets yield better results than previous studies.
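The top-k intersection step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each dataset has already been reduced to a mapping from feature name to an importance score (for example, the `feature_importances_` attribute of a fitted scikit-learn `RandomForestClassifier`). The function names `top_k_features` and `shared_top_k`, the dataset names, the feature names, and the scores are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def top_k_features(importances: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the highest importance, best first."""
    # Sort by descending score; break ties by name so the ranking is deterministic.
    ranked = sorted(importances.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def shared_top_k(per_dataset: Dict[str, Dict[str, float]], k: int) -> Set[str]:
    """Return the features that appear in the top-k list of every dataset."""
    top_sets = [set(top_k_features(scores, k)) for scores in per_dataset.values()]
    return set.intersection(*top_sets) if top_sets else set()


# Toy importance scores (hypothetical values, not results from the paper).
scores = {
    "dataset_a": {"taxon_A": 0.40, "taxon_B": 0.30, "taxon_C": 0.20, "taxon_D": 0.10},
    "dataset_b": {"taxon_B": 0.50, "taxon_D": 0.25, "taxon_A": 0.15, "taxon_C": 0.10},
}

# The paper uses k = 10, 20, 30, 40, 50; smaller values suit the toy data.
for k in (1, 2, 3):
    print(k, sorted(shared_top_k(scores, k)))
```

With real data, the same loop would run over k = 10, 20, 30, 40, and 50 and over all six datasets. Features that survive the intersection at small k are the ones ranked consistently high across datasets.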
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, H.T.T., Le, H.N., Nguyen, H.T. (2024). Feature Selection Based on Ranking Metagenomic Relative Abundance for Inflammatory Bowel Disease Prediction. In: Barolli, L. (eds) Complex, Intelligent and Software Intensive Systems. CISIS 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 87. Springer, Cham. https://doi.org/10.1007/978-3-031-70011-8_9
DOI: https://doi.org/10.1007/978-3-031-70011-8_9
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70010-1
Online ISBN: 978-3-031-70011-8
eBook Packages: Intelligent Technologies and Robotics (R0)