Implementation of Classification Algorithms on Genomic Data in Order to Determine the Diagnosis of Patients at Risk of Developing Breast Cancer

  • Conference paper
Advanced Research in Technologies, Information, Innovation and Sustainability (ARTIIS 2024)

Abstract

Throughout history, people’s health has been shaped by internal and external factors tied to their socio-economic environment. Breast cancer is a clear example: its risk factors include physical inactivity, weight gain and alcohol consumption, among others. Most predicted cases are in women and a small proportion are in men. The spread of technology across most sciences and fields of work has accelerated progress on complex problems. Big data applied to health makes it possible to extract relevant information from data on diseases, prognoses and treatments. The main objective of this research is to determine the diagnosis of breast cancer from the results of classification models applied to genomic data. The research methodology is quantitative, with measurable and verifiable results. The development methodology is a modified incremental methodology, with flexible steps that allow the results of each activity to be verified and revised. The experiments are run in RStudio with the R programming language.



Author information

Corresponding author

Correspondence to Bryan Steven Cortez Chichande.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Chichande, B.S.C., Pino, A.V., Ordoñez, J.P. (2025). Implementation of Classification Algorithms on Genomic Data in Order to Determine the Diagnosis of Patients at Risk of Developing Breast Cancer. In: Guarda, T., Portela, F., Gatica, G. (eds) Advanced Research in Technologies, Information, Innovation and Sustainability. ARTIIS 2024. Communications in Computer and Information Science, vol 2346. Springer, Cham. https://doi.org/10.1007/978-3-031-83210-9_1

  • DOI: https://doi.org/10.1007/978-3-031-83210-9_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-83209-3

  • Online ISBN: 978-3-031-83210-9

  • eBook Packages: Computer Science, Computer Science (R0)
