Abstract
Understanding disease-related metabolite interactions is a key issue in computational biology. We apply a modified Bayesian Optimization Algorithm to targeted metabolomics data from plasma samples of insulin-sensitive and insulin-resistant subjects, all of whom suffer from non-alcoholic fatty liver disease. In addition to improving classification accuracy by selecting relevant features, we extract the information that led to their selection and reconstruct networks from the detected feature dependencies. We compare the influence of several classifiers and scoring metrics and examine whether the reconstructed networks represent physiological metabolite interconnections. We find that the presented method can significantly improve the classification accuracy of metabolomics data that are otherwise hard to classify, and that the detected metabolite dependencies can be mapped to physiological pathways, which in turn are supported by the domain literature.
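The general idea of the abstract, wrapper-based feature selection driven by a probabilistic model from which feature dependencies are then read off, can be illustrated with a deliberately simplified sketch. It is not the modified Bayesian Optimization Algorithm used in the chapter: a real BOA learns a Bayesian network over the selection bits with a scoring metric, whereas this sketch samples each bit independently and only afterwards estimates pairwise mutual information among the best feature masks. The nearest-centroid classifier, all function names, and all parameter defaults are assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def nearest_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    hits = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
        pred = min(
            centroids,
            key=lambda c: sum((X[i][j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        hits += pred == y[i]
    return hits / len(X)


def mutual_information(pop, a, b):
    """Mutual information (in nats) between selection bits a and b across a population."""
    n = len(pop)
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p_ab = sum(1 for ind in pop if ind[a] == va and ind[b] == vb) / n
            p_a = sum(1 for ind in pop if ind[a] == va) / n
            p_b = sum(1 for ind in pop if ind[b] == vb) / n
            if p_ab > 0:
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def eda_feature_selection(X, y, pop_size=40, generations=15, keep=0.5,
                          mi_threshold=0.05, seed=0):
    """Simplified estimation-of-distribution search over binary feature masks.

    Returns (best_mask, best_score, edges); edges are feature pairs whose
    selection bits co-vary among the best masks of the final generation.
    """
    rng = random.Random(seed)
    n_feat = len(X[0])
    probs = [0.5] * n_feat  # per-feature selection probability
    best_mask, best_score = None, -1.0
    selected = []
    for _ in range(generations):
        pop = [[int(rng.random() < p) for p in probs] for _ in range(pop_size)]
        scored = sorted(
            ((nearest_centroid_accuracy(X, y, m), m) for m in pop), reverse=True
        )
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0]
        selected = [m for _, m in scored[: max(2, int(keep * pop_size))]]
        # Re-estimate the marginals from the selected masks; clamp to keep diversity.
        probs = [
            min(0.95, max(0.05, sum(m[j] for m in selected) / len(selected)))
            for j in range(n_feat)
        ]
    edges = [
        (a, b)
        for a, b in combinations(range(n_feat), 2)
        if mutual_information(selected, a, b) > mi_threshold
    ]
    return best_mask, best_score, edges
```

Calling `eda_feature_selection(X, y)` with `X` as a list of numeric feature rows and `y` as a list of class labels (at least two samples per class) returns the best mask, its leave-one-out accuracy, and the candidate dependency edges. In a setting like the one described above, the edges would be the starting point for comparison against known metabolic pathways.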
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Franken, H., Seitz, A., Lehmann, R., Häring, HU., Stefan, N., Zell, A. (2012). Inferring Disease-Related Metabolite Dependencies with a Bayesian Optimization Algorithm. In: Giacobini, M., Vanneschi, L., Bush, W.S. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2012. Lecture Notes in Computer Science, vol 7246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29066-4_6
DOI: https://doi.org/10.1007/978-3-642-29066-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29065-7
Online ISBN: 978-3-642-29066-4
eBook Packages: Computer Science (R0)