Inflammatory Bowel Disease Classification Improvement with Metagenomic Data Binning Using Mean-Shift Clustering

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1306)

Abstract

The gastrointestinal tract hosts the greatest concentration of bacteria in the human body; this diverse and complex microbial population is implicated in many different diseases. Metagenomics has driven many advances in the study of evolution and biodiversity. Applying machine learning algorithms to metagenomic problems has helped researchers make new advances in personalized medicine, especially in diagnosis and in improving human health. In this study, we propose an unsupervised binning approach combined with the Mean-shift algorithm to improve predictive performance. We evaluated it on the Inflammatory Bowel Disease (IBD) dataset with 6 subclasses. This clustering method improved results when deep learning techniques were applied, and it shows the promise of data preprocessing methods for other datasets.
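
The abstract does not spell out how the binning is done, so the following is only a minimal sketch of the general idea, not the authors' implementation: a one-dimensional, flat-kernel mean shift finds cluster centers in a pool of abundance values, and each value is then replaced by the index of its nearest center. The function names, the bandwidth of 0.1, and the abundance values are all hypothetical and chosen for illustration.

```python
from statistics import mean


def mean_shift_1d(values, bandwidth, max_iter=100, tol=1e-9):
    """Return sorted cluster centers found by flat-kernel mean shift in 1D."""
    modes = []
    for x in values:
        for _ in range(max_iter):
            # Flat kernel: average every sample within `bandwidth` of x.
            window = [v for v in values if abs(v - x) <= bandwidth]
            shifted = mean(window) if window else x
            if abs(shifted - x) < tol:
                break
            x = shifted
        modes.append(x)

    # Merge modes that converged to (nearly) the same location.
    centers = []
    for m in sorted(modes):
        if not centers or m - centers[-1] > bandwidth:
            centers.append(m)
    return centers


def assign_bins(values, centers):
    """Map each value to the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda i: abs(v - centers[i]))
        for v in values
    ]


# Hypothetical relative abundances pooled from a few samples (assumption).
abundances = [0.0, 0.01, 0.02, 0.5, 0.52, 0.51, 0.98, 1.0]
centers = mean_shift_1d(abundances, bandwidth=0.1)  # bandwidth is an assumption
bins = assign_bins(abundances, centers)
print(centers)
print(bins)
```

In this sketch the number of bins is not fixed in advance; it follows from the bandwidth and the data, which is the usual reason to prefer mean shift over equal-width binning.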

Author information

Corresponding author

Correspondence to Hai Thanh Nguyen.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Phan, N.Y.K., Nguyen, H.T. (2020). Inflammatory Bowel Disease Classification Improvement with Metagenomic Data Binning Using Mean-Shift Clustering. In: Dang, T.K., Küng, J., Takizawa, M., Chung, T.M. (eds) Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2020. Communications in Computer and Information Science, vol 1306. Springer, Singapore. https://doi.org/10.1007/978-981-33-4370-2_21

  • DOI: https://doi.org/10.1007/978-981-33-4370-2_21

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-33-4369-6

  • Online ISBN: 978-981-33-4370-2

  • eBook Packages: Computer Science, Computer Science (R0)
