Abstract
Metagenomics is one of the most prolific “omic” sciences in biological research on environmental microbial communities. Metagenomic studies generate high-dimensional, sparse, complex, and biologically rich datasets. In this research, we propose a framework that integrates omics knowledge to identify a suitably reduced set of microbiome features for gaining insight into the functional classification of metagenomic sequences. The proposed approach was applied to two use case studies: (1) cattle rumen microbiota samples, differentiating nitrate-treated from vegetable-oil-treated feed for improving cattle performance, and (2) human gut microbiota samples, classified into the functionally annotated categories of leanness, obesity, or overweight. For Use Case 1 (Bos taurus cattle rumen microbiota), an accuracy of 97.5% and an Area Under Curve (AUC) value of 0.972 were achieved using Logistic Regression (LR) both as the classification model and as the feature selector in a wrapper-based strategy. For Use Case 2 (human gut microbiota), an accuracy of 94.4% with an AUC of 1.000 was achieved. Overall, the LR classifier combined with wrapper-based feature selection using an LR learner proved the most robust in our analysis.
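To make the wrapper idea concrete, the following is a minimal sketch in plain Python of feature selection in which a logistic regression learner scores candidate feature subsets and the same kind of model is then used as the classifier. It is not the authors' implementation: the greedy forward search, the 5-fold cross-validation, the gradient-descent fitting, the function names, and the synthetic data (six features, only feature 2 informative) are all assumptions chosen for illustration. AUC is computed here as the area under the ROC curve via pairwise ranking.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_lr(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def cv_accuracy(X, y, cols, k=5):
    """k-fold cross-validated accuracy of LR restricted to the columns in `cols`."""
    Xs = [[row[c] for c in cols] for row in X]
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(y)) if i % k != fold]
        test = [i for i in range(len(y)) if i % k == fold]
        model = fit_lr([Xs[i] for i in train], [y[i] for i in train])
        probs = predict_proba(model, [Xs[i] for i in test])
        correct += sum((p >= 0.5) == (y[i] == 1) for p, i in zip(probs, test))
    return correct / len(y)

def wrapper_forward_selection(X, y, max_features=3):
    """Greedy forward search: keep adding the feature that most improves CV accuracy."""
    selected, best = [], 0.0
    while len(selected) < max_features:
        candidates = [c for c in range(len(X[0])) if c not in selected]
        score, col = max((cv_accuracy(X, y, selected + [c]), c) for c in candidates)
        if score <= best:
            break  # no candidate improves the wrapper score; stop searching
        selected.append(col)
        best = score
    return selected, best

def auc(y, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic stand-in for an abundance table (hypothetical data, not from the paper).
rng = random.Random(0)
y = [i % 2 for i in range(60)]
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in y]
for row, label in zip(X, y):
    row[2] += 2.0 if label == 1 else -2.0  # make feature 2 informative

selected, best = wrapper_forward_selection(X, y)
Xsel = [[row[c] for c in selected] for row in X]
model = fit_lr(Xsel, y)
train_auc = auc(y, predict_proba(model, Xsel))  # resubstitution AUC, optimistic
print(selected, round(best, 3), round(train_auc, 3))
```

The defining property of a wrapper strategy is that each candidate subset is judged by the cross-validated performance of the learner itself, rather than by a filter statistic computed independently of the model. The AUC printed at the end is measured on the training data and is therefore optimistic; a held-out set or nested cross-validation would be needed for an honest estimate.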
Notes
1. MetaPlat (http://www.metaplat.eu) is a 4-year project funded by European Horizon H2020-MSCA-RISE-2015.
Acknowledgement
This work was supported in part by the Research Strategy Fund of Ulster University and by the MetaPlat project (http://www.metaplat.eu), funded by H2020-MSCA-RISE-2015.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Wassan, J.T. et al. (2017). An Integrative Approach for the Functional Analysis of Metagenomic Studies. In: Huang, DS., Jo, KH., Figueroa-García, J. (eds) Intelligent Computing Theories and Application. ICIC 2017. Lecture Notes in Computer Science, vol 10362. Springer, Cham. https://doi.org/10.1007/978-3-319-63312-1_37
DOI: https://doi.org/10.1007/978-3-319-63312-1_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63311-4
Online ISBN: 978-3-319-63312-1
eBook Packages: Computer Science, Computer Science (R0)