A Comparative Study of Data Mining Techniques Applied to Renal-Cell Carcinomas

  • Conference paper
IoT Technologies for Health Care (HealthyIoT 2021)

Abstract

Although kidney cancer is one of the deadliest diseases and the fight against it has advanced enormously, the best methods to predict kidney cancer, namely Renal-Cell Carcinoma (RCC), are not well known. One way to accelerate knowledge about RCC is to apply Data Mining techniques to patients' personal and clinical data. It is therefore crucial to understand which techniques are most suitable for extracting knowledge about this disease. In this paper, we followed the CRISP-DM methodology to evaluate different techniques and determine those with the best predictive performance. For this purpose, we used a dataset of 821 records of RCC patients, obtained from The Cancer Genome Atlas. The present work tests different Data Mining techniques that can be used for two tasks: predicting the 5-year life expectancy of patients with renal cancer, and predicting the number of days to death for patients with a life expectancy of less than 5 years. The results showed that the best algorithm for estimating vital status at 5 years was Random Forest, with an accuracy of 87.65% and an AUROC of 0.931. For the prediction of days to death, the best performance was obtained with the k-Nearest Neighbors algorithm, with a root mean square error of 354.6 days. The work suggests that Data Mining techniques can help to understand the influence of various risk factors on the life expectancy of patients with RCC.
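The three metrics quoted in the abstract (accuracy, AUROC, and root mean square error) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation. The function names and all toy values below are assumptions made up for illustration and do not come from the paper's dataset.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of class labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auroc(y_true, scores):
    """Probability that a random positive case receives a higher score
    than a random negative case (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the target (here, days)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical toy data, NOT taken from the paper.
# Classification task: 1 = alive at 5 years, 0 = deceased.
alive_5y = [1, 0, 1, 1, 0, 0]
predicted_prob = [0.9, 0.4, 0.6, 0.3, 0.2, 0.5]
predicted_label = [int(p >= 0.5) for p in predicted_prob]

# Regression task: days to death for patients who died within 5 years.
days_true = [100, 400, 900, 1500]
days_pred = [160, 320, 900, 1500]

acc = accuracy(alive_5y, predicted_label)
auc = auroc(alive_5y, predicted_prob)
err = rmse(days_true, days_pred)
print(f"accuracy={acc:.3f} AUROC={auc:.3f} RMSE={err:.1f} days")
```

Accuracy depends on the 0.5 decision threshold chosen here, whereas AUROC is threshold-free, which is one reason the two are commonly reported together. RMSE is expressed in days, so it can be read directly against the 5-year horizon of the regression task.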

Acknowledgements

This work is funded by “FCT—Fundação para a Ciência e Tecnologia” within the R&D Units Project Scope: UIDB/00319/2020.

Author information

Corresponding author

Correspondence to Hugo Peixoto.

Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Duarte, A., Peixoto, H., Machado, J. (2022). A Comparative Study of Data Mining Techniques Applied to Renal-Cell Carcinomas. In: Spinsante, S., Silva, B., Goleva, R. (eds) IoT Technologies for Health Care. HealthyIoT 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 432. Springer, Cham. https://doi.org/10.1007/978-3-030-99197-5_5

  • DOI: https://doi.org/10.1007/978-3-030-99197-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-99196-8

  • Online ISBN: 978-3-030-99197-5

  • eBook Packages: Computer Science, Computer Science (R0)
