An Alternating Least Square Based Algorithm for Predicting Patient Survivability

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 996))

Abstract

Breast cancer is the most common cancer among women worldwide. Using machine learning to predict the survivability of breast-cancer patients has drawn considerable research interest. However, the task still faces many issues, such as missing-value imputation. The main objective of this paper is therefore to develop a novel imputation algorithm inspired by recommendation systems. More precisely, features with missing values are regarded as items to be evaluated for recommendation.

Consequently, a matrix factorisation algorithm (Alternating Least Square, ALS) is employed to replace missing values, and four different prediction strategies based on the ALS result are further discussed. The proposed ALS-based imputation algorithm is evaluated on a large patient dataset from the Surveillance, Epidemiology, and End Results (SEER) program. Experimental results demonstrate a significant improvement in survivability prediction compared to existing methods.
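
The abstract does not give the details of the imputation step, so the following is only a minimal sketch of how ALS-style matrix factorisation can fill missing entries in general, not the authors' implementation. It assumes a numeric patient-by-feature matrix with missing values encoded as NaN; the function name `als_impute` and the parameters `rank`, `reg`, and `n_iters` are hypothetical choices made for illustration.

```python
import numpy as np

def als_impute(X, rank=2, reg=0.1, n_iters=50, seed=0):
    """Fill NaN entries of X with a low-rank ALS reconstruction (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    mask = ~np.isnan(X)                      # True where a value was observed
    n_rows, n_cols = X.shape
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_rows, rank))   # row (patient) factors
    Q = rng.normal(scale=0.1, size=(n_cols, rank))   # column (feature) factors
    ridge = reg * np.eye(rank)               # L2 regularisation term

    for _ in range(n_iters):
        # Hold Q fixed: each row factor is a small ridge regression
        # fitted on that row's observed entries only.
        for i in range(n_rows):
            obs = mask[i]
            if obs.any():
                Qo = Q[obs]
                P[i] = np.linalg.solve(Qo.T @ Qo + ridge, Qo.T @ X[i, obs])
        # Hold P fixed: same update for each column factor.
        for j in range(n_cols):
            obs = mask[:, j]
            if obs.any():
                Po = P[obs]
                Q[j] = np.linalg.solve(Po.T @ Po + ridge, Po.T @ X[obs, j])

    filled = X.copy()
    filled[~mask] = (P @ Q.T)[~mask]         # only missing cells are replaced
    return filled
```

Observed values are left untouched; only the NaN cells are replaced by the corresponding entries of the reconstructed matrix. In practice the rank and the regularisation strength would need to be tuned, and categorical features (common in SEER data) would need an encoding step that this sketch does not cover.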

Author information

Corresponding authors

Correspondence to Jie Yang or Khin Than Win.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Hu, Q., Yang, J., Win, K.T., Huang, X. (2019). An Alternating Least Square Based Algorithm for Predicting Patient Survivability. In: Islam, R., et al. Data Mining. AusDM 2018. Communications in Computer and Information Science, vol 996. Springer, Singapore. https://doi.org/10.1007/978-981-13-6661-1_24

  • DOI: https://doi.org/10.1007/978-981-13-6661-1_24

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6660-4

  • Online ISBN: 978-981-13-6661-1

  • eBook Packages: Computer Science, Computer Science (R0)
