Implementation of Classification Algorithms on Genomic Data in Order to Determine the Diagnosis of Patients at Risk of Developing Breast Cancer

Chichande, Bryan Steven Cortez; Pino, Ariosto Vicuña; Ordoñez, Jessica Ponce

doi:10.1007/978-3-031-83210-9_1

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2346)

Included in the following conference series:

International Conference on Advanced Research in Technologies, Information, Innovation and Sustainability

Abstract

Throughout history, people’s health has been linked to internal and external factors that shape their socio-economic environment. Breast cancer is a clear example: its risk factors include physical inactivity, weight gain and alcohol consumption, among others. Most predicted cases occur in women, with a small proportion in men. The spread of technology across most sciences and fields of work has accelerated progress on complex problems. Big data applied to health makes it possible to discover relevant information in data on diseases, prognoses and treatments. The main objective of this research is to determine the diagnosis of breast cancer from the results of classification models applied to genomic data. The research methodology is quantitative, with measurable and verifiable results. The development methodology is a modified incremental methodology, with flexible steps for verifying and revising the results of each activity. The experiments are carried out in RStudio with the R programming language.

Author information

Authors and Affiliations

Quevedo State Technical University, Quevedo Los Ríos, 120150, Ecuador
Bryan Steven Cortez Chichande, Ariosto Vicuña Pino & Jessica Ponce Ordoñez

Corresponding author

Correspondence to Bryan Steven Cortez Chichande.

Editor information

Editors and Affiliations

Universidad Estatal Península de Santa Elena, Santa Elena, Ecuador
Teresa Guarda
University of Minho, Guimarães, Portugal
Filipe Portela
Universidad Andrés Bello, Santiago, Chile
Gustavo Gatica

Cite this paper

Chichande, B.S.C., Pino, A.V., Ordoñez, J.P. (2025). Implementation of Classification Algorithms on Genomic Data in Order to Determine the Diagnosis of Patients at Risk of Developing Breast Cancer. In: Guarda, T., Portela, F., Gatica, G. (eds) Advanced Research in Technologies, Information, Innovation and Sustainability. ARTIIS 2024. Communications in Computer and Information Science, vol 2346. Springer, Cham. https://doi.org/10.1007/978-3-031-83210-9_1

DOI: https://doi.org/10.1007/978-3-031-83210-9_1
Published: 13 March 2025
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-83209-3
Online ISBN: 978-3-031-83210-9
eBook Packages: Computer Science, Computer Science (R0)
