Abstract
Metagenomic analysis has become increasingly important in medicine, with numerous recent studies exploring the association between metagenomic data and human disease. Discretization approaches have proven to be efficient tools for improving disease prediction performance on metagenomic data. This study proposes a technique that uses entropy, combined with several scaler algorithms, to construct bins for discretizing metagenomic data for disease classification tasks. On six bacterial species abundance metagenomic datasets, the entropy-based discretization method gives promising results compared with Equal Width Binning, with AUCs of 0.955, 0.826, 0.893, 0.692, 0.798, and 0.765 obtained by a one-dimensional convolutional neural network on data including samples related to Liver Cirrhosis, Colorectal Cancer, Inflammatory Bowel Disease (IBD), and two datasets of Type 2 Diabetes (namely, T2D and WT2D), respectively.
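To make the contrast between the two binning strategies concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how entropy is used to place the bin edges, so the sketch assumes a supervised variant that greedily chooses cut points minimising the weighted class entropy of the resulting intervals. The scaler step mentioned above is omitted, and all function names and the toy abundance values are hypothetical.

```python
import numpy as np

def class_entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of class labels."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def best_entropy_cut(values, labels):
    """Return the cut that most reduces weighted class entropy, or None."""
    order = np.argsort(values)
    v, y = values[order], labels[order]
    best_cut, best_score = None, class_entropy(y)
    for i in range(1, len(v)):
        if v[i] == v[i - 1]:
            continue  # no boundary can separate identical values
        score = (i * class_entropy(y[:i])
                 + (len(v) - i) * class_entropy(y[i:])) / len(v)
        if score < best_score:
            best_cut, best_score = (v[i - 1] + v[i]) / 2.0, score
    return best_cut

def entropy_cuts(values, labels, max_bins=4):
    """Assumed scheme: split intervals greedily until max_bins is reached."""
    cuts, segments = [], [(values, labels)]
    while len(cuts) < max_bins - 1 and segments:
        v, y = segments.pop(0)
        cut = best_entropy_cut(v, y)
        if cut is None:
            continue  # interval is already pure or cannot be split
        cuts.append(cut)
        left = v <= cut
        segments.append((v[left], y[left]))
        segments.append((v[~left], y[~left]))
    return np.array(sorted(cuts))

def equal_width_cuts(values, n_bins=4):
    """Interior edges of n_bins equally wide intervals over the value range."""
    return np.linspace(values.min(), values.max(), n_bins + 1)[1:-1]

# Hypothetical toy data: relative abundance of one species in 8 samples,
# with a binary disease label per sample.
abundance = np.array([0.0, 0.001, 0.002, 0.003, 0.2, 0.25, 0.3, 0.9])
disease = np.array([0, 0, 0, 0, 1, 1, 1, 1])

ew = equal_width_cuts(abundance, n_bins=2)
en = entropy_cuts(abundance, disease, max_bins=2)
ew_bins = np.digitize(abundance, ew)  # bin index per sample, equal width
en_bins = np.digitize(abundance, en)  # bin index per sample, entropy-based
print("equal width:", ew_bins.tolist())
print("entropy    :", en_bins.tolist())
```

On this toy example the equal-width edge falls in the middle of the value range and isolates only the largest sample, whereas the entropy-based cut lands between the low-abundance and high-abundance groups. Abundance data are typically skewed toward small values, which is one plausible reason a label-aware or distribution-aware binning could help, although the sketch makes no claim about the reported AUCs.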
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Phan, N.Y.K., Tran, T.B., Nguyen, H.H., Nguyen, H.T. (2021). Entropy-Based Discretization Approach on Metagenomic Data for Disease Prediction. In: Dang, T.K., Küng, J., Chung, T.M., Takizawa, M. (eds) Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2021. Communications in Computer and Information Science, vol 1500. Springer, Singapore. https://doi.org/10.1007/978-981-16-8062-5_25
DOI: https://doi.org/10.1007/978-981-16-8062-5_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8061-8
Online ISBN: 978-981-16-8062-5
eBook Packages: Computer Science (R0)