Abstract
Lung cancer is the most frequent and the deadliest of all cancer types in both sexes. Approximately 80% of newly diagnosed lung cancers are non-small cell lung cancers. Early diagnosis improves the chances of survival. Machine learning allows us to process the considerable number of variables involved in this disease. Using metabolites as attributes for the analysis, we can distinguish lung cancer patients from healthy individuals. In addition, machine learning algorithms reveal which metabolites contribute decisively to the classification. The objective of this study is to demonstrate the accuracy, sensitivity and specificity of a supervised learning algorithm in classifying and predicting non-small cell lung cancer, using concentration values found in the serum and plasma metabolome of affected and healthy individuals. We obtained the dataset from the Metabolomics Workbench repository; it contains 335 samples and 139 known detected metabolites. Of all the models applied, the Random Forest Classifier obtained the highest accuracy. It can classify participants according to diagnosis with >75% accuracy in serum samples. Important serum metabolites for the classification included aspartic acid, fructose, and tocopherol alpha; for plasma, they were cystine, pyruvic acid, and tocopherol alpha. The specified metabolites are strongly associated with this condition and are potential biomarkers for the disease. By giving clues for an earlier diagnosis, this study contributes to the field of personalized medicine and to the understanding of the biological processes of lung cancer.
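To make the three reported measures concrete, the following is a minimal sketch of how accuracy, sensitivity and specificity can be computed from a binary classifier's predictions. The function name `diagnostic_metrics`, the label encoding (1 = lung cancer, 0 = healthy) and the eight example labels are illustrative assumptions, not the study's code or data.

```python
from typing import Dict, Sequence


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity and specificity for binary labels.

    Assumed encoding (hypothetical): 1 = lung cancer, 0 = healthy.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Fraction of true cancer cases that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Fraction of healthy samples correctly labelled healthy.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical labels for eight held-out samples (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]

metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

In this toy case one cancer sample is missed and no healthy sample is misclassified, so sensitivity is lower than specificity even though overall accuracy is high; reporting all three measures exposes that asymmetry.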
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Rondon-Soto, D., Vela-Anton, P. (2020). Detection of Non-small Cell Lung Cancer Adenocarcinoma Using Supervised Learning Algorithms Applied to Metabolomic Profiles. In: Lossio-Ventura, J.A., Condori-Fernandez, N., Valverde-Rebaza, J.C. (eds) Information Management and Big Data. SIMBig 2019. Communications in Computer and Information Science, vol 1070. Springer, Cham. https://doi.org/10.1007/978-3-030-46140-9_18
DOI: https://doi.org/10.1007/978-3-030-46140-9_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46139-3
Online ISBN: 978-3-030-46140-9
eBook Packages: Computer Science, Computer Science (R0)